Data Structure 


Advanced Organizer
Topics covered in this unit
Intended learning outcomes
Instructors' Notes
Metadata and Revision History
1. Definitions and Terminology

1.1 Data and information
1.2 Geographic data and geographic information
1.3 The information domain
1.4 The data-oriented approach to information systems
2. Information Organization

2.1 The data perspective of information organization
2.1.1 Information organization and descriptive data
2.1.2 Information organization and graphical data
2.2 The relationship perspective of information organization
2.2.1 Categorical relationships
2.2.2 Spatial relationships
2.3 The operating system perspective of information organization
2.4 The application architecture perspective of information organization
3. Data structure

3.1 Levels of data abstraction
3.2 Descriptive data structures
3.2.1 Relational data structure
3.2.2 Object-oriented data structure
3.3 Graphical data structures
3.3.1 Raster data structure
3.3.2 Vector data structure
3.4 The georelational data structure
4. Data modeling

4.1 Conceptual data modeling
4.2 Logical data modeling
4.3 Physical data modeling
5. Process modeling


6. Summary
1. Definitions and Terminology
1.1. Data and information
many people use the terms "data" and "information" as synonyms, but the two terms convey distinct concepts
"data" is defined as a body of facts or figures, which have been gathered systematically for one or more specific purposes
data can exist in the forms of
linguistic expressions (e.g. name, age, address, date, ownership)
symbolic expressions (e.g. traffic signs)
mathematical expressions (e.g. E = mc²)
signals (e.g. electromagnetic waves)
"information" is defined as data which have been processed into a form that is meaningful to a recipient and is of perceived value in current or prospective decision making
although data are ingredients of information, not all data make useful information
data not properly collected and organized are a burden rather than an asset to an information user
data that make useful information for one person may not be useful to another person
information is only useful to its recipients when it is
relevant (to its intended purposes and with an appropriate level of detail)
reliable, accurate and verifiable (by independent means)
up-to-date and timely (depending on purposes)
complete (in terms of attribute, spatial and temporal coverage)
intelligible (i.e. comprehensible by its recipients)
consistent (with other sources of information)
convenient/easy to handle and adequately protected
the function of an information system is to change "data" into "information", using the following processes (Figure 1):
conversion --- transforming data from one format to another, from one unit of measurement to another, and/or from one feature classification to another
organization --- organizing or re-organizing data according to database management rules and procedures so that they can be accessed cost-effectively
structuring --- formatting or re-formatting data so that they can be acceptable to a particular software application or information system
modeling --- including statistical analysis and visualization of data that will improve the user's knowledge base and intelligence in decision making
the concepts of "organization" and "structure" are crucial to the functioning of information systems --- without organization and structure it is simply impossible to turn data into information
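The four processes listed above can be illustrated with a minimal Python sketch. The forest-stand records, the field names and the function names are all hypothetical and chosen only for illustration; a real information system would perform these steps with database and GIS software rather than plain functions.

```python
from statistics import mean

# Hypothetical raw data: average tree heights recorded in feet for three stands.
raw_data = [
    {"stand": "S2", "height_ft": 65.6},
    {"stand": "S1", "height_ft": 49.2},
    {"stand": "S3", "height_ft": 82.0},
]

FEET_TO_METRES = 0.3048


def convert(records):
    """Conversion: change the unit of measurement from feet to metres."""
    return [
        {"stand": r["stand"], "height_m": round(r["height_ft"] * FEET_TO_METRES, 1)}
        for r in records
    ]


def organize(records):
    """Organization: index the records by stand identifier for direct access."""
    return {r["stand"]: r for r in sorted(records, key=lambda r: r["stand"])}


def structure(indexed):
    """Structuring: format the records as comma-separated text for another application."""
    lines = ["stand,height_m"]
    lines += [f"{key},{rec['height_m']}" for key, rec in indexed.items()]
    return "\n".join(lines)


def model(indexed):
    """Modeling: a simple statistical summary that could support a decision."""
    heights = [rec["height_m"] for rec in indexed.values()]
    tallest = max(indexed, key=lambda key: indexed[key]["height_m"])
    return {"mean_height_m": round(mean(heights), 1), "tallest": tallest}


indexed_stands = organize(convert(raw_data))
print(structure(indexed_stands))
print(model(indexed_stands))
```

Each function takes the output of the previous one, which mirrors the point that data only become information after they have been organized and structured.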
1.2. Geographic data and geographic information
geographic data are a special type of data; here "geographic" means that
the data are pertinent to features and resources of the Earth, as well as the human activities based on or associated with these features and resources
the data are collected and used for problem solving and decision making associated with geography, i.e. location, distribution and spatial relationships within a particular geographical framework
geographic data are different from other types of data in that
they are geographically referenced, i.e. they can be identified and located by coordinates
they are made up of a descriptive element (which tells what they are) and a graphical element (which tells what they look like, where they are found and how they are spatially related to one another)
the descriptive element is also commonly referred to as non-spatial data
the graphical element is also commonly referred to as spatial data
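As a rough illustration of the two elements described above, the sketch below keeps the descriptive (non-spatial) attributes and the graphical (spatial) coordinates of one feature side by side. The class name, field names and sample values are assumptions made for this example and are not taken from any particular GIS package.

```python
from dataclasses import dataclass, field


@dataclass
class GeographicFeature:
    feature_id: str
    # Descriptive (non-spatial) element: tells what the feature is.
    attributes: dict = field(default_factory=dict)
    # Graphical (spatial) element: tells where the feature is,
    # stored here as a list of (x, y) coordinate pairs.
    coordinates: list = field(default_factory=list)

    def bounding_box(self):
        """Return (min_x, min_y, max_x, max_y) computed from the coordinates."""
        xs = [x for x, _ in self.coordinates]
        ys = [y for _, y in self.coordinates]
        return (min(xs), min(ys), max(xs), max(ys))


# A hypothetical forest stand with both elements filled in.
stand = GeographicFeature(
    feature_id="S1",
    attributes={"dominant_species": "spruce", "average_height_m": 15.0},
    coordinates=[(0.0, 0.0), (100.0, 0.0), (100.0, 50.0), (0.0, 50.0)],
)
print(stand.attributes["dominant_species"], stand.bounding_box())
```

The shared `feature_id` is what allows the two elements to be stored separately and still be linked.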
geographic information is obtained by processing geographic data, the aim of which is to
improve the user's knowledge about the geography of the Earth's features and resources, as well as human activities associated with these features and resources
enable the user to develop spatial intelligence for problem solving and decision making concerning the occurrence, utilization and conservation of the Earth's features and resources, as well as the impacts and consequences of human activities associated with them
because of the special nature and characteristics of geographic data, generic concepts of information organization and data structure cannot be applied directly to them
this unit attempts to explain the principles and methods of information organization and data structure with special reference to geographic data, particularly with respect to:
the organization and structure of descriptive geographic data
the organization and structure of graphical geographic data
the relationship and linkage between the descriptive and graphical elements of geographic data
1.3. The information domain
an information system is designed to process data, i.e. to accept input (data), manipulate it in some way, and produce output (information) (Figure 1)
it is also designed to process events --- an event represents a problem or system control which triggers data processing procedures in an information system
the information domain of an information system therefore includes both data (i.e. characters, numbers, images and sound) and events (i.e. problem and control)
there are three different components of the information domain
information organization (also referred to as information structure) --- the internal organization of various data and event items (see Section 2 below)
the design and implementation of information organization is referred to as data structure (see Section 3 below)
information contents and relationships --- the attributes relating to the data and the events, and the relationships with one another
the process of identifying information contents and relationships is known as data modeling in information system design (see Section 4 below)
information flow --- the ways by which data and events change as they are processed by the information system
the process of identifying information flow is known as process modeling in information system design (see Section 5 below)
the above view of the information domain provides the conceptual framework that links database management and application development in information systems
it signifies that information organization and data structure are not only important for the management of data, but also for the development of software applications that utilize these data
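The idea that an information domain holds both data and events can be sketched as follows. This is a simplified illustration under stated assumptions: the class `InformationSystem`, its method names and the sample readings are hypothetical, and an event is reduced to a name that triggers one processing procedure.

```python
class InformationSystem:
    """A toy information system that holds data and responds to events."""

    def __init__(self):
        self.data = []       # data items accepted as input
        self.handlers = {}   # event name -> data processing procedure

    def accept(self, item):
        """Accept one data item as input."""
        self.data.append(item)

    def on(self, event_name, procedure):
        """Register the procedure that an event should trigger."""
        self.handlers[event_name] = procedure

    def trigger(self, event_name):
        """Process the stored data in response to an event and return the output."""
        return self.handlers[event_name](self.data)


system = InformationSystem()
for reading in (12, 7, 30):
    system.accept(reading)

system.on("report_total", lambda data: sum(data))
system.on("report_maximum", lambda data: max(data))

print(system.trigger("report_total"))
print(system.trigger("report_maximum"))
```

The same stored data yield different information depending on which event arrives, which is why both data and events belong to the information domain.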
1.4. The data-oriented approach to information systems
an information system is perceived as being made up of four components: data, technology, process (or application) and people
the traditional approach to information system development was either technology-oriented or process-oriented
technology-oriented approach --- based on availability and/or functionality of hardware and software
process-oriented approach --- based on the desire to automate a particular business process
the current approach to information system development is data-oriented or data-driven
information systems are designed and developed to process, manage and analyze data in support of the business objectives of an organization
of the four components of information systems, data are the most stable
computer technology is evolving very rapidly
consider the advances in computer hardware and software in the last several years
there is always a risk in using the newest technologies, which have not been fully market-tested
processes may change due to changing business objectives, customer requirements, modes of service delivery and available tools and technology
consider the changes in bank transactions (depositing and withdrawing money) that have occurred in the last several years
process-oriented applications have a relatively short life span, and frequent re-development is very costly
particular business functions always require the same data for operation and decision-making purposes despite changes in technology and process
for example, bank transactions use the same data no matter how they are done (over the counter or using an automated teller machine)
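The bank example can be sketched in a few lines to show why data are more stable than process. The `Account` structure and the two process functions below are hypothetical; the point is only that both processes operate on the same data through the same shared operation.

```python
from dataclasses import dataclass


@dataclass
class Account:
    number: str
    balance: float


def apply_transaction(account, amount):
    """Shared data operation: positive amounts deposit, negative amounts withdraw."""
    if account.balance + amount < 0:
        raise ValueError("insufficient funds")
    account.balance += amount
    return account.balance


def over_the_counter(account, amount, teller_id):
    # Process 1: a teller (identified by teller_id) keys in the transaction.
    return apply_transaction(account, amount)


def automated_teller_machine(account, amount, card_pin_ok):
    # Process 2: the customer uses a machine; the data involved are unchanged.
    if not card_pin_ok:
        raise PermissionError("card or PIN rejected")
    return apply_transaction(account, amount)


account = Account(number="001", balance=100.0)
over_the_counter(account, 50.0, teller_id="T7")
automated_teller_machine(account, -30.0, card_pin_ok=True)
print(account)
```

Adding a third process, such as online banking, would require another small function but no change to `Account`.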
of the four components of information systems, data are the most expensive to acquire
in many projects, the collection of data accounts for half or more of the capital investment
it is natural to make use of the most stable and expensive component to drive information system design and development in order to maximize the return from capital investment
a data-oriented approach to information systems is characterized by
managing data as a valuable corporate resource in the same way financial, technical and human resources are managed
this is the basis of the concept of information resource management (IRM)
sharing data among different users or user groups
this helps maximize the benefit-cost ratio of the capital investment in information systems
data-centric strategy in the acquisition of hardware and software
specifications of hardware and software must meet data requirements; data requirements should not be changed to suit the characteristics or functionality of hardware and software
data-driven application development
software applications are designed to enable the effective and efficient use of data in business operation and decision making
organizations often use the introduction of information systems as an opportunity to re-engineer the business, i.e. to change the philosophy and ways of running a business
data orientation does not mean that every user is involved in information organization and data structure
ensuring that information organization and data structure meet the business needs of an organization is the responsibility of a small team of technical staff under the leadership of a database administrator
the database administration team defines an organization's information organization by carrying out detailed user requirement studies
representatives of end users assist in defining information organization by taking part in user requirement studies
system developers design and build software applications without the need to worry about information organization and data structure, i.e. they develop applications on the basis of existing or accepted data structures
however, system developers may also assist in defining information organization by taking part in user requirement studies
information organization and data structure are transparent to end users, i.e. they can use software applications without the need to know anything about data structure
however, this does not imply that information organization and data structure are trivial considerations in information systems
information organization and data structure reflect the users' requirements
many projects fail because of a lack of understanding of information organization and data structure, not because of a lack of technological capability
identification of users' requirements, which forms the foundation of good design and correct specification of information organization and data structure, is always the most important step in information system development
the ultimate goal of information organization and data structure is to create the necessary technical environment that allows the development of information systems which are
cost-effective to implement --- by the ability to use shared data and possibly applications by users in different organizations
flexible to build --- by permitting the addition or removal of applications, in response to changing needs and objectives of the information users, without affecting existing data structure
easy to use --- by eliminating the need for regular users to worry about the structure of the data

--------------------------------------------------------------------------------

2. Information Organization
information organization can be understood from four perspectives:
a data perspective
a relationship perspective
an operating system (OS) perspective
an application architecture perspective
2.1. The data perspective of information organization
the information organization of geographic data must be considered in terms of their descriptive elements and graphical elements because
these two types of data elements have distinctly different characteristics
they have different storage requirements
they have different processing requirements
2.1.1. Information organization of descriptive data
for descriptive data, the most basic element of information organization is called a data item (Figure 2a)
a data item represents an occurrence or instance of a particular characteristic pertaining to an entity (which can be a person, thing, event or phenomenon)
it is the smallest unit of stored data in a database, commonly referred to as an attribute
in database terminology, an attribute is also referred to as a stored field
the value of an attribute can be in the form of a number (integer or floating-point), a character string, a date or a logical expression (e.g. T for 'true' or 'present'; F for 'false' or 'absent')
some attributes have a definite set of values known as permissible values or domain of values (e.g. age of people from 1 to 150; the categories in a land use classification scheme; and the academic departments in a university)
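A domain of permissible values can be expressed as a simple check on each data item. The sketch below is a minimal illustration: the attribute names, the land use categories and the function `is_permissible` are assumptions made for this example, not part of any specific database system.

```python
# Each attribute is mapped to a rule that decides whether a value is permissible.
DOMAINS = {
    # age of people from 1 to 150, stored as an integer
    "age": lambda value: isinstance(value, int) and 1 <= value <= 150,
    # categories of a hypothetical land use classification scheme
    "land_use": lambda value: value in {"residential", "commercial", "industrial", "agricultural"},
    # logical attribute stored as T (true/present) or F (false/absent)
    "is_present": lambda value: value in {"T", "F"},
}


def is_permissible(attribute, value):
    """Return True if the value lies within the attribute's domain of values."""
    return DOMAINS[attribute](value)


print(is_permissible("age", 42))
print(is_permissible("land_use", "forest"))
```

Database systems usually enforce such rules through column types and constraints, so that invalid data items are rejected when they are stored.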
a group of related data items form a record (Figure 2b)
related data items are occurrences of different characteristics pertaining to the same person, thing, event or phenomenon (e.g. in a forest resource inventory, a record may contain related data items such as stand identification number, dominant tree species, average height and average breast height diameter)
a record may contain a combination of data items having different types of values (e.g. in the above example, a record has two character strings representing the stand identification number and domi

Information Organization and Data Structure