IB/Group 4/Computer Science/Databases/Basic concepts
Basic concepts[edit | edit source]
Computers store data. Data can be any one of several different types (e.g. numeric, text, Boolean, etc.) but has no intrinsic meaning to a human. Data becomes information when it is put into a context that gives it meaning.
For example: 32 23 11 08 40 17 is data, but it has no meaning.
If we provide a context for that data, it becomes information, e.g.:
- The home and away scores for 6 soccer teams last Saturday
- The temperatures in degrees Celsius for 6 cities around the world at mid-day today
- The ages in years of the last 6 people to walk through the turnstiles of the Eiffel Tower in Paris.
Thus: Information = Data + context
Strictly speaking, databases store data, not information. However the terminology is commonly used loosely as there is an assumption that data stored in and retrieved from a database is interpreted by human beings within the scope of an appropriate context.
"An information system (IS) is any organized system for the collection, organization, storage and communication of information.
"An information system (IS) is a group of components that interact to produce information."
"..information systems .. are made up of six components: hardware, software, data, people, network, and process."
Information system is therefore a wider term than database as it refers to a complete system, although many (if not most) information systems include one or more databases as part of their overall structure.
There are some unique problems if not using the database:
1. Size of Data: The small amount of data storing into spreadsheet is fine, however it might turns into a large amount of data then Spreadsheet solution will not work. Even if the size of data records goes into millions then storing data in multiple spreadsheet which will create a problem of speed. It will take you long time to find a record from the multiple spreadsheet files.
2. Ease of Updating Data: Multiple peoples cannot edit the same file on same time. Other peoples must wait until files are available to update which results into wastage of time.
3. Accuracy: When user doing data entry in files then it might be possible to incorrect data due to no validation present like you can enter wrong spelling, wrong dates, and wrong amount. So the Data accuracy is hard to maintain and accuracy is in question.
4. Security: You cannot secure the data in the text files and spreadsheet. Anyone can access the file and read any data present in the file. So storing data will not work with banking, healthcare application, payroll department where privacy is difficult to maintain.
5. Redundancy: The duplication of data can be possible using text files or spreadsheet. Chances of adding multiple copies of data cannot be limited here. This will leads to accuracy issues. Maintaining and updating multiple copies is not an easy task.
6. Incomplete Data: Some of the data is not considered not important, so such data not entered in the file as no validation in place which leads the data integrity is in question.
To prevent above problem associated with storing data in the text file or spreadsheet the database is required.
A transaction is a unit of work that is performed against a database. Transactions are units or sequences of work accomplished in a logical order, whether in a manual fashion by a user or automatically by some sort of a database program.
A transaction is the propagation of one or more changes to the database. For example, if you are creating a record or updating a record or deleting a record from the table, then you are performing a transaction on that table. It is important to control these transactions to ensure the data integrity and to handle database errors.
For example, to ensure data consistency when moving money between two accounts it is necessary to complete two operations (debiting one account and crediting the other). Unless both operations are carried out successfully, the transaction will be rolled back.
Consistency, in the context of databases, states that data cannot be written that would violate the database’s own rules for valid data. If a certain transaction occurs that attempts to introduce inconsistent data, the entire transaction is rolled back and an error returned to the user.
Data integrity is the overall completeness, accuracy and consistency of data. This can be indicated by the absence of alteration between two instances or between two updates of a data record, meaning data is intact and unchanged. Data integrity is usually imposed during the database design phase through the use of standard procedures and rules. Data integrity can be maintained through the use of various error-checking methods and validation procedures.
A transaction, in the context of a database, is a logical unit that is independently executed for data retrieval or updates. In relational databases, database transactions must be atomic, consistent, isolated and durable--summarized as the ACID acronym.
Concurrency control is a database management systems (DBMS) concept that is used to address conflicts with the simultaneous accessing or altering of data that can occur with a multi-user system. If two or more users try to update the contents of a database simultaneously, locks and partitions are put into place to prevent it. Thus enabling greater concurrency.
ACID refers to:
- atomicity ensures that all operations within the work unit are completed successfully. Otherwise, the transaction is aborted at the point of failure and all the previous operations are rolled back to their former state.
- consistency ensures that the database properly changes states upon a successfully committed transaction.
- isolation enables transactions to operate independently of and transparent to each other.
- durability. ensures that the result or effect of a committed transaction persists in case of a system failure.
Query functions A query is a request for data or information from a database table or combination of tables. This data may be generated as results returned by Structured Query Language (SQL) or as pictorials, graphs or complex results, e.g., trend analyses from data-mining tools.
ADD, DELETE UPDATE are all functions that can be performed on the database
UPDATE Query is used to modify the existing records in a table. You can use the WHERE clause with the UPDATE query to update the selected rows, otherwise all the rows would be affected.
Data verification is a way of ensuring the user types in what he or she intends, in other words, to make sure the user does not make a mistake when inputting data. ... Validation is about checking the input data to ensure it conforms with the data requirements of the system to avoid data errors. Data validation is the process of ensuring that data is valid. Data validation rules are used in data validation processes to ensure the validity. The term validity of data mostly denotes the meaningfulness and correctness of the data. Data Validation is done on the original document whereas data verification is done on copies. This is the major difference between data validation and data verification.