Final Harrison Bus 391 Study Guide
2024-2025 complete update
Relational Database •Tables of records
•Link field in one table to field in another table
•Separates data from paths to retrieve data
Who Invented Relational Databases? E. F. Codd, a British mathematician working at IBM.
What does a Relational Database organize? It organizes information into tables of records that are
related to one another by linking a field in one table to a field in another table with matching
data. The approach separates the data from the paths to retrieve them, thus making the database
less dependent on the hardware and its particular operating system. Codd's invention eventually
came to dominate the field.
Example of Relational Database The first table shows the student ID, last name, first name,
and birth date. The second table shows student registrations with fields that display student ID,
class code, and grade. Note that student ID is also included in this table, which makes it possible
to link records in the two tables together.
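The two-table example above can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows here are made up for illustration; only the fields themselves come from the example.

```python
import sqlite3

# In-memory database; names and sample rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (
        student_id INTEGER PRIMARY KEY,
        last_name  TEXT,
        first_name TEXT,
        birth_date TEXT
    );
    CREATE TABLE registrations (
        student_id INTEGER,
        class_code TEXT,
        grade      TEXT
    );
""")
conn.executemany("INSERT INTO students VALUES (?, ?, ?, ?)", [
    (1, "Lopez", "Ana", "2003-04-12"),
    (2, "Chen", "Wei", "2002-11-30"),
])
conn.executemany("INSERT INTO registrations VALUES (?, ?, ?)", [
    (1, "BUS391", "A"),
    (1, "ACC201", "B"),
    (2, "BUS391", "B"),
])

# The shared student_id field is what links records in the two tables.
rows = conn.execute("""
    SELECT s.last_name, s.first_name, r.class_code, r.grade
    FROM students AS s
    JOIN registrations AS r ON r.student_id = s.student_id
    ORDER BY s.student_id, r.class_code
""").fetchall()

for row in rows:
    print(row)
```

Note that the query names only the matching field, not where or how the records are stored, which is the separation of data from retrieval paths described earlier.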
Data Model for Relational Databases Covers entities and attributes, primary keys and uniqueness, and
normalization.
•Each entity represented in the model will have attributes, or fields, that describe the entity. For
example, "Employees" is a straightforward entity with attributes such as employee ID number,
last name, first name, birth date, email address, and phone number. The entity and its attributes
(or fields) will become a record, and a collection of records will become a table.
How many primary keys can each record have in a table? Each record in a table must have
one primary key, which is a field or group of fields that makes the record unique in that table.
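A minimal sketch of the "field or group of fields" idea, using sqlite3. It assumes a hypothetical registrations table where student ID and class code together form the primary key; the database then refuses a second record with the same pair.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed design: the pair (student_id, class_code) is the primary key.
conn.execute("""
    CREATE TABLE registrations (
        student_id INTEGER,
        class_code TEXT,
        grade      TEXT,
        PRIMARY KEY (student_id, class_code)
    )
""")
conn.execute("INSERT INTO registrations VALUES (1, 'BUS391', 'A')")
# Same student, different class: the combined key is still unique.
conn.execute("INSERT INTO registrations VALUES (1, 'ACC201', 'B')")

try:
    # Same student and same class again: violates uniqueness.
    conn.execute("INSERT INTO registrations VALUES (1, 'BUS391', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

record_count = conn.execute("SELECT COUNT(*) FROM registrations").fetchone()[0]
print(duplicate_rejected, record_count)
```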
What is Normalization? Normalization is the process of refining entities and their
relationships, and helps minimize the duplication of information in the tables. For example, in
the employee table, one goal of normalization is to make each attribute functionally dependent
on the employee ID number, which uniquely identifies each employee. Functional dependence
means that for each value of employee ID, there is exactly one value for each of the attributes
included in the record, and that the employee ID determines that value. There would be just one
employee email address, one birth date, one last name, one first name, and one department.
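Functional dependence can be checked mechanically. The sketch below is one possible way to do it, not a standard tool: the helper name dependence_conflicts and the sample employee records are hypothetical.

```python
def dependence_conflicts(records, key, attribute):
    """Return the key values that appear with more than one value of `attribute`."""
    first_seen = {}
    conflicts = set()
    for rec in records:
        k, v = rec[key], rec[attribute]
        if k in first_seen and first_seen[k] != v:
            conflicts.add(k)
        first_seen.setdefault(k, v)
    return conflicts

# Made-up records: employee 101 is listed with two different departments.
employees = [
    {"employee_id": 101, "email": "a@example.com", "department": "Sales"},
    {"employee_id": 102, "email": "b@example.com", "department": "HR"},
    {"employee_id": 101, "email": "a@example.com", "department": "Marketing"},
]

email_conflicts = dependence_conflicts(employees, "employee_id", "email")
dept_conflicts = dependence_conflicts(employees, "employee_id", "department")
print(email_conflicts, dept_conflicts)
```

Here email is functionally dependent on employee ID, while department is not, because ID 101 maps to two departments. Normalization would resolve that, for example by correcting the data or moving department assignments into their own table.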
Data Warehouse Another integration strategy. The data warehouse is a central data
repository containing information drawn from multiple sources that can be used for analysis,
intelligence gathering, and strategic planning.
What is the process to build a data warehouse? Extract, transform, and load (ETL). The first
step is to extract data from the sources, and then transform and cleanse it so that it adheres to
common data definitions. After transformation, the data is loaded into the data warehouse,
typically another database. At frequent intervals, the load process repeats to keep it up-to-date.
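The three ETL steps can be sketched in a few lines. Everything specific below is an assumption for illustration: the two source systems, their field names, the sample records, and the fact_sales warehouse table.

```python
import sqlite3

# Hypothetical source extracts with inconsistent field names and formats.
sales_system = [{"CustID": "17", "Amount": "1,200.50", "Region": "west"}]
web_store = [{"customer": 17, "total_usd": 99.0, "region": "WEST "}]

def extract():
    """Pull raw records from each source, tagged with where they came from."""
    for rec in sales_system:
        yield "sales", rec
    for rec in web_store:
        yield "web", rec

def transform(source, rec):
    """Cleanse a record so it follows one common definition: (id, amount, region)."""
    if source == "sales":
        amount = float(rec["Amount"].replace(",", ""))
        return int(rec["CustID"]), amount, rec["Region"].strip().upper()
    return int(rec["customer"]), float(rec["total_usd"]), rec["region"].strip().upper()

def load(conn, rows):
    """Write the cleansed rows into the warehouse table."""
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_sales (customer_id INTEGER, amount REAL, region TEXT)"
)
load(warehouse, [transform(src, rec) for src, rec in extract()])
loaded = warehouse.execute("SELECT * FROM fact_sales ORDER BY amount").fetchall()
print(loaded)
```

In practice the load step would run on a schedule, as the text notes, rather than once.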
Benefits of data warehouse Building a data warehouse from operational databases and adding
some external sources is manageable for most organizations and extremely useful. Consider
also how much semistructured and unstructured information flows through Twitter, YouTube,
Facebook, and Instagram, some of which could give the company a competitive advantage if
analyzed quickly. The amount of data available is also exploding because so much is gathered
automatically by sensors, cameras, RFID readers, and mobile devices. Big data refers to
collections of data that are so enormous in size, so varied in content, and so fast to accumulate
that they are difficult to store and analyze using traditional approaches.
How to manage the Database? •Performance tuning and scalability
•Integrity, security, and recovery
•Documentation
Database Performance Tuning and Scalability The database needs tuning for optimal performance, and the tuning process takes
into account the way the end users access the data. Optimizing performance for speedy retrieval
of information may require slowing down other tasks such as data entry or editing. Scalability
refers to a system's ability to handle rapidly increasing demand, and this is another performance
issue.
Database Administrator (DBA) manages the rules that help ensure the integrity of the data.
The software can enforce many different rules, such as the referential integrity constraint, which
ensures that every foreign key entry actually exists as a primary key entry in its main table.
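A small sketch of the referential integrity constraint in action, using sqlite3. The table names and rows are assumptions; the PRAGMA line is needed because SQLite only enforces foreign keys when that setting is turned on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE students (student_id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("""
    CREATE TABLE registrations (
        student_id INTEGER NOT NULL REFERENCES students(student_id),
        class_code TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO students VALUES (1, 'Lopez')")
conn.execute("INSERT INTO registrations VALUES (1, 'BUS391')")  # student 1 exists

try:
    # Student 99 has no primary key entry in the students table.
    conn.execute("INSERT INTO registrations VALUES (99, 'BUS391')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(orphan_rejected)
```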
Database Management Systems A database management system (DBMS) will provide tools
to handle access control and security, such as password protection and user
authentication.
Documentation The data model can be documented using a database schema, which
graphically shows the tables, attributes, keys, and logical relationships. The data dictionary
should contain the details of each field, including descriptions written in language users can
easily understand in the context of the business. These details are sometimes omitted when
developers rush to implement a project, but the effort pays off later.
Distributed Databases and Blockchain •Distributed Database Architectures
•Blockchain
-Cryptocurrencies
-Walmart supply chain
Even supercomputers have their limits, so designers launch distributed databases in which all or
portions of the database are located on separate servers to distribute the processing loads and
improve performance. A distributed database may involve many servers so that large numbers of
customers, suppliers, and staff members can access the information they need in a timely way.
Strategy to replicate the database on several different servers One strategy is simply to
replicate the database on several different servers on a frequent schedule so that user traffic can
be directed to one of several different servers. The systems may be in the same data center, but
they might also be located on different continents to better support users across the globe.
Another strategy used to accommodate increasing loads is to fragment the database so that
portions are stored separately on different servers. This way, each server only needs to respond
to queries that involve data stored on that server. Regardless of which architecture is used to
create a distributed database, the design must ensure the integrity of the data in case conflicts
arise, especially when adding or deleting data.
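The fragmentation strategy can be sketched as follows. This is a toy model under stated assumptions: each "server" is just a Python dictionary, records are keyed by an integer customer ID, and the class name FragmentedStore and the modulo routing rule are hypothetical choices, not something the text prescribes.

```python
class FragmentedStore:
    """Toy fragmented database: each record lives on exactly one server."""

    def __init__(self, server_count):
        # Each dict stands in for a separate database server.
        self.servers = [{} for _ in range(server_count)]

    def _server_for(self, customer_id):
        # Assumed routing rule: integer ID modulo the number of servers.
        return customer_id % len(self.servers)

    def put(self, customer_id, record):
        self.servers[self._server_for(customer_id)][customer_id] = record

    def get(self, customer_id):
        # Only the one server holding this fragment is consulted.
        return self.servers[self._server_for(customer_id)].get(customer_id)


store = FragmentedStore(server_count=3)
for cid in range(1, 10):
    store.put(cid, {"name": f"customer-{cid}"})

print(store.get(4))
print([len(s) for s in store.servers])
```

A replication strategy would instead copy every record to every server, which trades extra storage and synchronization work for the ability to send any query to any server.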
Distributed Architecture - Blockchain Blockchain can be described as a type of open,
distributed ledger in which individual records, called "blocks," contain a time stamp and
reference links to previous transactions to prevent tampering and ensure transparency. The
ledgers reside on an open peer-to-peer network with no central control. After verification, blocks
of transactions are recorded on many servers and never erased, so anyone with access rights can
view them and submit new transactions.
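Why linked blocks resist tampering can be shown with a simplified sketch. It assumes each block stores a SHA-256 hash of the previous block as its "reference link"; the function names and sample transactions are made up, and the peer-to-peer network and verification steps of a real blockchain are left out entirely.

```python
import hashlib
import json
import time

def block_hash(block):
    """Hash a block's full contents in a stable (sorted-key) JSON form."""
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, transactions):
    """Append a block holding a time stamp and a link to the previous block."""
    previous = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({
        "timestamp": time.time(),
        "transactions": transactions,
        "previous_hash": previous,
    })

def is_valid(chain):
    """Check that every block's link still matches the block before it."""
    return all(
        chain[i]["previous_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )

chain = []
add_block(chain, ["Ana pays Wei 5"])
add_block(chain, ["Wei pays Sam 2"])
add_block(chain, ["Sam pays Ana 1"])
valid_before = is_valid(chain)

# Tamper with an earlier record: the next block's link no longer matches.
chain[1]["transactions"] = ["Wei pays Sam 200"]
valid_after = is_valid(chain)

print(valid_before, valid_after)
```

Because many servers hold copies of the ledger, a tampered copy like this one would disagree with the others and could be detected.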
Blockchain is used to record transactions involving bitcoins, an online cryptocurrency that is
not issued by a government and instead is valued based on market demand. That blockchain was
designed to be public, so that people could submit transactions anonymously. This makes them
difficult to trace—a big plus for criminals.
Blockchain technology was predicted to revolutionize business and finance, making central
authorities such as banks and credit card companies unnecessary and offering a transparent,
trusted, and tamper-proof means for two parties to transact business. That has not happened, at
least not yet, but the technology is evolving, and many companies are launching blockchain
projects in partnership with suppliers. Walmart, for instance, launched one to create a more
transparent and trusted means to track fresh produce in its supply chain. Walmart controls
permission to record transactions, of course, so the blockchain is not a public one.
Human Element •Ownership
•Shadow Systems
•Data stewards
•Databases without boundaries
•Stakeholders
Data management Efforts focus mainly on a key area, such as customers, or on a limited number of
entities that are most important. Teams from across the company meet to identify the differences
in how data is defined and find ways to resolve them. Data stewards may then be assigned as
watchdogs and bridge builders to remind everyone about how data should be defined. Master data
management has less to do with technology than with people, processes, and governance.
A company may set the policy that all information resources are company-owned, but in practice,
people often view these resources more protectively, even when compliance and security don't
demand tight access controls. Norms about how records are used emerge over time, and although
many are unwritten, these norms can certainly affect employees' behavior. Salespeople may want
to protect access to their own sales leads, or whole departments might want to control who has
access to the records that they maintain. They may prefer that
employees outside the department have the right to view one of "their" records but not change it.
Shadow System Although the integrated enterprise database is a critical resource, changes
to support new features can be painfully slow. People want to get their jobs done as efficiently as
possible, and sometimes the quick solution is a shadow system.
These are smaller databases developed by individuals or departments that focus on their creator's
specific information requirements. They are not managed by central IT staff, who may not even
know they exist. Shadow systems are easy to create with tools like Access and Excel, but the
information they hold may not be consistent with what is in the corporate database. Another
hazard is that the department may be left hanging when the creator leaves because no one else
knows quite what the shadow system does.
Master Data Management (MDM) A broader strategy to address underlying inconsistencies in
the way people use data.
This effort attempts to achieve uniform definitions for entities and their attributes across all
business units, and it is especially important for mergers. The units must agree on how everyone
will define terms such as employee, sale, or student.
Data Stewards Data stewards may then be assigned as watchdogs and bridge builders to remind
everyone about how data should be defined.
Databases without boundaries Another example of how the human element interacts with
information management. These are databases
in which people outside the enterprise enter and manage most of the records. These contributors
feel strong ownership over their records. A valuable lesson from the efforts to build databases
without boundaries is simply the need to plan for high volume, rapid growth, and potential abuse.
Top-level management needs strategic information and insights from big data along with
accurate, enterprise-wide reports to balance the information needs of many stakeholders.
Operating units must have reports on transactions that match their operations, and they need
information systems that are easily changed to support fast-moving business requirements.
Processing •Central processing unit (CPU)
•Transistors