4. Data Modeling
data modeling is the process of defining real-world phenomena or geographic features of interest in terms of their characteristics and their relationships with one another
it is concerned with different phases of work carried out to implement information organization and data structure
there are three steps in the data modeling process, resulting in a series of progressively formalized data models as the form of the database becomes more and more rigorously defined
conceptual data modeling --- defining in broad and generic terms the scope and requirements of a database
logical data modeling --- specifying the user's view of the database with a clear definition of attributes and relationships
physical data modeling --- specifying internal storage structure and file organization of the database
data modeling corresponds closely to the three levels of data abstraction in database design noted in Section 3.1 above:
conceptual data modeling ----> data model
logical data modeling ---------> data structure
physical data modeling -------> file structure
4.1. Conceptual data modeling
entity-relationship (E-R) modeling is probably the most popular method of conceptual data modeling
it is sometimes referred to as a method of semantic data modeling because it uses a human language-like vocabulary to describe information organization
it involves four aspects of work:
identifying entities
an entity is defined as a person, a place, an event, a thing, etc.
identifying attributes
determining relationships
drawing an entity-relationship diagram (E-R diagram) (Figure 21)
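
The four aspects of work listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Owner` and `Parcel` entities, their attributes, the `owns` relationship and the "1:N" cardinality notation are hypothetical examples and are not taken from Figure 21.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Entity:
    """An entity type: a person, a place, an event, a thing, etc."""
    name: str
    attributes: list[str] = field(default_factory=list)


@dataclass
class Relationship:
    """A named association between two entity types."""
    name: str
    source: Entity
    target: Entity
    cardinality: str  # assumed notation, e.g. "1:N"


# Steps 1 and 2: identify entities and their attributes (hypothetical)
owner = Entity("Owner", ["owner_id", "name"])
parcel = Entity("Parcel", ["parcel_id", "area", "land_use"])

# Step 3: determine the relationship between them (hypothetical)
owns = Relationship("owns", owner, parcel, "1:N")


def describe(rel: Relationship) -> str:
    """Step 4: render one relationship as a line of a text-only E-R diagram."""
    return (
        f"[{rel.source.name}] --{rel.name} ({rel.cardinality})--> "
        f"[{rel.target.name}]"
    )


print(describe(owns))
```

A real E-R diagram is drawn graphically; the text rendering here only shows that entities, attributes and relationships are enough to generate one.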
4.2. Logical data modeling
logical data modeling is a comprehensive process by which the conceptual data model is consolidated and refined
the proposed database is reviewed in its entirety in order to identify potential problems such as
irrelevant data that will not be used
omitted or missing data
inappropriate representation of entities
lack of integration between various parts of the database
unsupported applications
potential additional cost to revise the database
the end product of logical data modeling is a logical schema
the logical schema is developed by mapping the conceptual data model (such as the E-R diagram) to a software-dependent design document (Figure 22)
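
As a rough illustration of this mapping step, the sketch below turns a small conceptual model into relation schemas. It assumes a relational DBMS as the target software, that the first attribute of each entity is its key, and that only one-to-many relationships occur; the function name `to_relations` and the `Owner` and `Parcel` entities are hypothetical.

```python
def to_relations(entities, relationships):
    """Map entities and 1:N relationships to relation schemas.

    entities: dict of entity name -> list of attributes (first one is the key)
    relationships: list of (one_side, many_side) entity-name pairs
    """
    # Each entity becomes one relation with the same attributes
    relations = {name: list(attrs) for name, attrs in entities.items()}
    # Each 1:N relationship adds the key of the "one" side
    # as a foreign key in the relation on the "many" side
    for one_side, many_side in relationships:
        foreign_key = entities[one_side][0]
        if foreign_key not in relations[many_side]:
            relations[many_side].append(foreign_key)
    return relations


# Hypothetical conceptual model: one owner owns many parcels
entities = {
    "Owner": ["owner_id", "name"],
    "Parcel": ["parcel_id", "area"],
}
relations = to_relations(entities, [("Owner", "Parcel")])

for name, attrs in relations.items():
    print(f"{name}({', '.join(attrs)})")
```

Many-to-many relationships would need a separate linking relation, which this sketch deliberately leaves out.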
4.3. Physical data modeling
physical data modeling is the database design process by which the actual tables that will be used to store the data are defined in terms of
data format --- the format of the data that is specific to a database management system (DBMS)
storage requirements --- the volume of the database
physical location of data --- optimizing system performance by minimizing the need to transmit data between different storage devices or data servers
the end product of physical data modeling is a physical schema (Figure 23)
a physical schema is also variously known as a data dictionary, item definition table, data specific table or physical database definition
it is both software- and hardware-specific
this means the physical schemas for different systems look different from one another
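
The data format and storage requirements described above can be made concrete with a small sketch. It uses Python's standard `struct` module to define a fixed-width record layout and estimate the volume of a table. The field list, the field widths and the function `estimate_volume` are hypothetical, and a real DBMS adds its own overhead (indexes, page headers) that is ignored here.

```python
import struct

# Hypothetical item definition table: (field name, struct format code)
PARCEL_FIELDS = [
    ("parcel_id", "i"),   # 4-byte signed integer
    ("area", "d"),        # 8-byte floating point number
    ("land_use", "10s"),  # fixed-width 10-byte text field
]

# "<" selects little-endian byte order with no padding between fields,
# so the record size is simply the sum of the field widths
RECORD_FORMAT = "<" + "".join(code for _, code in PARCEL_FIELDS)
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def estimate_volume(record_count: int) -> int:
    """Estimate raw storage in bytes for a given number of records."""
    return RECORD_SIZE * record_count


# Store and read back one record in this physical format
packed = struct.pack(RECORD_FORMAT, 101, 2500.0, b"forest")
parcel_id, area, land_use = struct.unpack(RECORD_FORMAT, packed)

print(RECORD_SIZE, "bytes per record")
print(estimate_volume(1_000_000), "bytes for one million records")
print(parcel_id, area, land_use.rstrip(b"\x00"))
```

Changing the byte order or the field widths gives a different physical layout for the same logical table, which is one way physical schemas come to differ between systems.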

--------------------------------------------------------------------------------

5. Process Modeling
process modeling is the process-oriented approach, as opposed to the data-oriented approach, to information system design
the objective is to identify the processes that the information system will perform
it also aims at identifying how information is transformed from one process to another
the end product of process modeling is a data flow diagram (DFD)
this implies that process modeling is by no means concerned only with processes; it also deals with information organization and data structure
in the context of information system design, process modeling is one of the methods of structured business function decomposition used to determine user requirements in conceptual modeling
DFD is the principal modeling tool
a DFD is constructed using four basic symbols to represent processes, data stores, entities and data flows in a business function (Figure 24)
process --- it represents the transformation of data as they flow through the system: data flow into a process, are changed, and then flow out to another process or a data store
entity --- the basic definition of an entity is similar to that for E-R modeling and it represents the initial source and final destination of data in a DFD
data store --- a temporary or permanent holding area for data
data flow --- the connection between processes and data stores along which individual entities or collections of entities flow
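
The four symbols can be modeled as a small graph structure, sketched below. The order-handling example is hypothetical and is not the business function shown in Figure 24. The check in `is_valid_flow` reflects a common DFD drawing convention (every data flow starts or ends at a process) that this unit does not state explicitly, so it is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass

# The three node symbols; data flows are the edges between them
NODE_KINDS = {"process", "entity", "data_store"}


@dataclass(frozen=True)
class Node:
    name: str
    kind: str  # one of NODE_KINDS

    def __post_init__(self):
        if self.kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {self.kind}")


@dataclass(frozen=True)
class DataFlow:
    source: Node
    target: Node
    data: str  # label describing what flows along the connection


def is_valid_flow(flow: DataFlow) -> bool:
    """Assumed convention: a flow must touch at least one process."""
    return "process" in (flow.source.kind, flow.target.kind)


# Hypothetical business function: handling a customer order
customer = Node("Customer", "entity")
validate_order = Node("Validate order", "process")
orders = Node("Orders", "data_store")

flows = [
    DataFlow(customer, validate_order, "order form"),
    DataFlow(validate_order, orders, "validated order"),
]

for flow in flows:
    print(f"{flow.source.name} --{flow.data}--> {flow.target.name}")
print(all(is_valid_flow(f) for f in flows))
```

A flow drawn directly from `customer` to `orders` would fail the check, because no process transforms the data on the way.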
process modeling is a top-down analysis and design method
it results in a hierarchy of DFDs that represent a general-to-detail decomposition of processes (Figure 25)
a top-level DFD, called the Level-0 DFD, typically contains a single process or a small number of processes that describe a business from a global perspective
this DFD is then decomposed into lower levels of DFDs (i.e. Level-1, Level-2, etc.) that provide a progressively more detailed breakdown of business processes
the final DFD is used as the basis for information organization and data structure in the process-oriented approach to information system development
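
The general-to-detail decomposition can be sketched as a tree of processes, where the depth of a process in the tree corresponds to the DFD level at which it appears. The process names below are hypothetical and do not reproduce Figure 25.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Process:
    """A process that may be decomposed into more detailed sub-processes."""
    name: str
    children: list[Process] = field(default_factory=list)


def walk_levels(process: Process, level: int = 0):
    """Yield (level, name) pairs top-down, parent before children."""
    yield level, process.name
    for child in process.children:
        yield from walk_levels(child, level + 1)


# Hypothetical decomposition of a single Level-0 process
root = Process("Manage orders", [
    Process("Receive order", [Process("Check stock")]),
    Process("Ship order"),
])

for level, name in walk_levels(root):
    print(f"{'  ' * level}Level-{level}: {name}")
```

The traversal visits each branch fully before moving to the next one, which mirrors how an analyst refines one process at a time.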

--------------------------------------------------------------------------------

6. Summary
this unit presents an overview, rather than a detailed explanation, of the principles and methods of information organization and data structure
the aim is to provide students with a coherent view of information organization from conceptualization, through design and specification, to the practical implementation of data and file structures in information science and management
this enables students to understand how different processes in information system development, such as data modeling, database design and application development, are related to one another
information organization refers to the internal organization of data and event items in information systems
information organization is a key consideration in today's data-oriented approach to system design and development; it is crucial to the functioning of information systems
information organization is largely conceptual in nature, and can be understood from four interrelated perspectives: data, relationship, operating system and application architecture
data structure is the design and implementation of information organization; it is the intermediate step of work between conceptual database design and the practical implementation of file structures
the identification of entities, attributes and relationships for data structure can be carried out either by data modeling or process modeling

--------------------------------------------------------------------------------

7. Review and Study Questions
The following three lines of figures have been extracted from a computer file:





00713344 5000 7.50 1998 12 31 000999999999999
23112410 0500 7.50 1999 11 01 000999999999999
33132211 8000 8.00 2001 06 30 000999999999999

Are these data or information? Explain why.


Explain the importance of information organization and data structure to the functioning of information systems.

Information organization and data structure are key considerations in information system development. Who are the people responsible for identifying, specifying and implementing an organization's requirements for information organization and data structure?

Explain the differences between geographic data and other types of data from the perspective of information organization and data structure.

List the characteristics of the database approach as opposed to the conventional data file approach to data processing.

Explain the difference between a "data model" and a "database model".

Define "categorical relationship" and "spatial relationship". Explain why spatial relationships are more difficult than categorical relationships to implement in data structure.

Information systems are now mostly based on the client/server architecture. Explain the impact of this particular architecture on information organization in system implementation.

What is a relation in the context of data structure? List the characteristics of a relation in terms of data structure.

What is an object in the context of data structure? How is the data structure for an object-oriented database schema constructed?

Explain the relationships among conceptual, logical and physical modeling in database design.

What is a data flow diagram? Explain how a data flow diagram can be used in connection with information organization and data structure.

--------------------------------------------------------------------------------

8. References
Date, C.J. (1995) An Introduction to Database Systems (6th ed.) Addison-Wesley, Reading, MA.
Elmasri, R. and Navathe, S.B. (1994) Fundamentals of Database Systems. Addison-Wesley, Menlo Park, CA.

Everest, G.C. (1986) Database Management: Objectives, System Functions and Administration. McGraw-Hill, New York.

Goodchild, M.F. (1992) Geographic Data Modeling. Computers and Geosciences. Vol. 18, No. 4, pp. 401-408.

Peuquet, D.J. (1991) Methods for Structuring Digital Cartographic Data in a Personal Computer Environment. In Geographic Information Systems: The Microcomputer and Modern Cartography by Taylor, D.R.F. (ed.), Pergamon Press, Oxford.

Pressman, R.S. (1997) Software Engineering: A Practitioner's Approach (4th ed.) McGraw-Hill, New York.
