UNIX Programming

"Chapter Seven - Data Management"

Chapter Outline

Lecture Notes

Data Management

In this chapter, we look first at ways of managing your resource allocation, then at dealing with files that are accessed by many users more or less simultaneously, and lastly at one tool provided in most UNIX systems for overcoming the limitations of data files.

We can summarize these topics as three ways of managing data: dynamic memory management, file locking, and the dbm database.

Managing Memory

On all computer systems memory is a scarce resource.

UNIX applications are never permitted to access physical memory directly.

UNIX provides memory protection which guarantees that incorrect (or malicious) programs don't overwrite the memory of other processes or the operating system.

In general, the memory allocated to one process can be neither read nor written to by any other process.

Almost all versions of UNIX use hardware facilities to enforce this private use of memory.

Simple Memory Allocation

We allocate memory using the malloc call in the standard C library.

Try It Out - Simple Memory Allocation

We can allocate a great deal of memory on most UNIX systems. Let's try.

Type in the following program, memory1.c

When we run this program, it outputs:

How It Works

The program asks malloc for a pointer to one megabyte of memory. It checks that the memory was assigned, uses the memory to hold the words "Hello World", and then prints out the words.
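The steps just described can be sketched as follows. This is a minimal illustration, not the memory1.c listing itself; the function name make_greeting and the constant A_MEGABYTE are assumptions made for the example.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define A_MEGABYTE (1024 * 1024)   /* assumed name for the block size */

/* Allocate `size` bytes and store a greeting in them.
   Returns NULL if malloc fails or the block is too small for the text.
   (make_greeting is a hypothetical helper, not part of memory1.c.) */
char *make_greeting(size_t size)
{
    const char *text = "Hello World";
    char *buffer;

    if (size < strlen(text) + 1)
        return NULL;

    buffer = malloc(size);
    if (buffer == NULL)          /* always check that memory was assigned */
        return NULL;

    strcpy(buffer, text);
    return buffer;
}
```

A caller would request make_greeting(A_MEGABYTE), print the result with printf("%s\n", ...), and release the block with free when finished.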

Allocating Lots of Memory

Now we will ask the machine for all of its memory!

Try It Out - Asking for all Physical Memory

With memory2.c, we're going to ask for all the machine's memory:


The output, somewhat abbreviated, is:

How It Works

The program loops, asking for more and more memory.

If we time how long it took to run using the command,

on this system, we see an elapsed time of just 0.54 seconds.

Try It Out - Available Memory

Let's see just how much memory we can allocate on this machine with memory3.c. It is extremely system-unfriendly and could affect a multiuser machine quite seriously.


This time, the output, again abbreviated, is,

and then the program ends.

How It Works

The UNIX kernel allocates memory upon request. When it runs out of physical memory, it uses swap space. When both are exhausted, or a per-process resource limit is exceeded, the kernel finally refuses the request for further memory.

Abusing Memory

Suppose we try and do 'bad' things with memory.

Try It Out - Abuse Your Memory

Let's allocate some memory and then attempt to write past the end, in memory4.c.

The output is simply:

How It Works

The UNIX memory management system has protected the rest of the system from this abuse of memory.

The Null Pointer

UNIX systems are very protective about writing or reading from a null pointer.

Try It Out - Accessing a Null Pointer

Let's find out what happens when we access a null pointer in memory5.c.

The output is:

How It Works

The first printf attempts to print out a string obtained from a null pointer, then the sprintf attempts to write to a null pointer. The program terminates.

Freeing Memory

We have been using memory and hoping that UNIX would recover it. It does, but only when the program terminates.

Programs that use memory on a dynamic basis should always release unused memory back to the malloc memory manager using the free call.

A call to free should only be made with a pointer to memory allocated by a call to malloc.

Try It Out - Freeing Memory

This program's called memory6.c:

How It Works

This program simply shows how to call free with a pointer to some previously allocated memory.
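As a rough sketch of that pattern (not the memory6.c listing), the function below allocates a block, uses it, and hands it back. The name use_scratch_block is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate a scratch block, use it, and return it with free.
   Returns 0 on success, -1 if the allocation failed.
   (use_scratch_block is a hypothetical name for this example.) */
int use_scratch_block(size_t size)
{
    char *block = malloc(size);

    if (block == NULL)
        return -1;

    memset(block, 0, size);   /* stand-in for real work on the block */

    free(block);              /* give the memory back to the malloc manager */
    block = NULL;             /* do not keep a dangling pointer around */
    return 0;
}
```

Setting the pointer to NULL after free is a defensive habit rather than a requirement; it makes an accidental later use of the pointer easier to spot.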

Other Memory Allocation Functions

There are two other memory allocation functions that are not used as often as malloc and free. These are calloc and realloc. The prototypes are:

Another problem is that realloc returns a null pointer if it was unable to resize the memory. This means that, in some applications, you should avoid writing code like this:
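The pattern to avoid is presumably assigning the result of realloc straight back to the only pointer to the block: if realloc fails, that pointer becomes null and the original memory can no longer be reached or freed. A safer sketch keeps the result in a temporary first. The helper name resize_buffer is an assumption for illustration.

```c
#include <stdlib.h>

/* Resize *buf to new_size bytes (new_size is assumed to be non-zero)
   without losing the original block if realloc fails.
   Returns 0 on success, -1 on failure, in which case *buf is unchanged.
   (resize_buffer is a hypothetical helper.) */
int resize_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);

    if (tmp == NULL)
        return -1;     /* the old block is still valid and still owned by *buf */

    *buf = tmp;        /* only overwrite the pointer once we know it worked */
    return 0;
}
```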

File Locking

File locking is used on a multiuser, multitasking system to establish control of a file so it can be updated in a safe fashion.

Creating Lock Files

Linux creates a lock file in the /usr/spool/uucp directory.

This directory may also be used to indicate the availability of serial ports:

Lock files act only as indicators. They are advisory, not mandatory.

To create a file to use as a lock indicator, we use the open system call defined in fcntl.h, with the O_CREAT and O_EXCL flags set.
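A minimal sketch of that call is shown here. The function name create_lock_file and the 0444 permission bits are assumptions made for the example. Because O_CREAT and O_EXCL are used together, open fails with EEXIST if the file is already there, so only one process can succeed.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to create `path` as a lock file.
   Returns 0 if this call created the file, or the errno value from
   open (for example EEXIST) if it could not.
   (create_lock_file is a hypothetical helper.) */
int create_lock_file(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0444);

    if (fd == -1)
        return errno;

    close(fd);   /* the file's existence is the lock; the descriptor is not needed */
    return 0;
}
```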

Try It Out - Creating a Lock File

Let's see this in action with lock1.c

The first time we run the program, the output is,

but the next time we try, we get:

How It Works

The program calls open to create a file called /tmp/LCK.test, using the O_CREAT and O_EXCL flags. The first time it works, the second time it fails.

Error 17 refers to EEXIST, an error that is used to indicate that a file already exists.

Error numbers are defined in the header file errno.h or files included by it. In this case, the definition reads:

Try It Out - Cooperative Lock Files

1. Here's the source of our test program, lock2.c:

2. The critical section starts here...

..and ends here:

To run it, we should first ensure that the lock file doesn't exist, with

then run two copies of the program, by using the command:

This is the output that we get:

How It Works

The program loops ten times trying to access the critical resource by creating a unique lock file.
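The cooperative scheme can be sketched as a pair of helpers: one that retries the exclusive create until it succeeds or gives up, and one that removes the lock file on leaving the critical section. This is an illustration under assumed names (acquire_lock_file, release_lock_file), not the lock2.c listing.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Try up to max_tries times to create the lock file, sleeping
   wait_seconds between attempts.
   Returns 0 once the lock is held, -1 if every attempt failed.
   (Both function names here are hypothetical.) */
int acquire_lock_file(const char *path, int max_tries, unsigned int wait_seconds)
{
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0444);

        if (fd != -1) {
            close(fd);
            return 0;              /* we now own the lock */
        }
        if (errno != EEXIST)
            return -1;             /* unexpected failure: give up */
        if (attempt + 1 < max_tries)
            sleep(wait_seconds);   /* another process holds the lock; wait */
    }
    return -1;
}

/* Leave the critical section by removing the lock file. */
int release_lock_file(const char *path)
{
    return unlink(path);
}
```

The critical section sits between a successful acquire_lock_file and the matching release_lock_file. Since the lock is only advisory, every cooperating program has to follow the same protocol.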

Locking Regions

We can lock regions of a file, so that a particular section of the file is locked, but other programs may access other parts of the file.

It can be done using calls to either fcntl or lockf.

The definition of fcntl is:

fcntl operates on open file descriptors. The three commands we are interested in for file locking are F_GETLK, F_SETLK and F_SETLKW.

When we use these, the third argument must be a pointer to a struct flock, so the prototype is effectively:

The flock (file lock) structure is implementation-dependent, but will contain at least the following members:

The l_type member takes one of several values, also defined in fcntl.h. These are:

The F_GETLK Command

The F_GETLK command gets locking information about the file that fildes (the first parameter) has open. fcntl used with the F_GETLK command returns any information that would prevent the lock occurring.

The values used in the flock structure are:

The F_SETLK Command

This command attempts to lock or unlock part of the file referenced by fildes. The values used in the flock structure (and different from those used by F_GETLK) are:

The F_SETLKW Command

This is the same as the F_SETLK command above, except that if it can't obtain the lock, the call will wait until it can.
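To make the two commands concrete, here is a small sketch that fills in a struct flock and applies it with either F_SETLK or F_SETLKW. The wrapper name set_region_lock and its wait flag are assumptions for illustration; the struct members and command names are the standard ones from fcntl.h.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Apply a lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK) to
   len bytes starting at byte offset start.
   If wait is non-zero, F_SETLKW is used and the call blocks until the
   lock can be obtained; otherwise F_SETLK fails straight away.
   Returns 0 on success, -1 on failure.
   (set_region_lock is a hypothetical wrapper.) */
int set_region_lock(int fd, short type, off_t start, off_t len, int wait)
{
    struct flock region;

    memset(&region, 0, sizeof region);   /* clear implementation-specific members */
    region.l_type = type;
    region.l_whence = SEEK_SET;          /* offsets are measured from the file start */
    region.l_start = start;
    region.l_len = len;

    if (fcntl(fd, wait ? F_SETLKW : F_SETLK, &region) == -1)
        return -1;
    return 0;
}
```

For example, set_region_lock(fd, F_RDLCK, 10, 20, 0) requests a shared lock on bytes 10 to 30, and passing F_UNLCK over the same range releases it.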

Use of read and write with Locking

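When using locking on regions of a file, it's important to use the lower level read and write calls to access the data in the file, rather than the buffered stdio functions. The lower level calls don't buffer data, so a program only reads or writes the bytes it has actually locked.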

Try It Out - Locking a File with fcntl

To try out locking, we need two programs, one to do the locking and one to test.

The first program does the locking.

1. Start with the includes and variable declarations:

2. Open a file descriptor:

3. Put some data in the file:

4. Set up region 1 with a shared lock, from bytes 10 to 30:

5. Set up region 2 with an exclusive lock, from bytes 40 to 50:

6. Now lock the file...

7. ...and wait for a while.

How It Works

The program creates a file, opens it for reading and writing, and fills it with data. It sets up a shared read lock on bytes 10 to 30, and an exclusive write lock on bytes 40 to 50. Then it locks the two regions.

When the program starts to wait, this is the situation with locks:

Try It Out - Testing Locks on a File

The second program for testing the locks is lock4.c.

1. As usual, begin with the includes and declarations:

2. Open a file descriptor:

3. Set up the region we wish to test:

4. Now test the lock on the file:


5. Now repeat the test with a shared (read) lock. Set up the region we wish to test again:

6. Test the lock on the file again:

To test out locking, we first need to run the lock3 program, then the lock4 program.

We do this by executing the lock3 program in the background, with the command:

The command prompt returns, since lock3 is running in the background, and we then immediately run the lock4 program with the command:

The output we get, with some omissions for brevity, is:

How It Works

For each group of five bytes in the file, lock4 sets up a region structure to test for locks on the file.
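A single test of this kind can be sketched with F_GETLK as follows. The function name conflicting_lock_owner is an assumption, and this is not the lock4.c listing. When no conflicting lock exists, fcntl sets l_type to F_UNLCK; otherwise it overwrites the structure with details of a lock that is in the way, including the owner's process ID. Locks held by the calling process itself are never reported.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Ask whether a lock of `type` (F_RDLCK or F_WRLCK) on len bytes from
   offset start would be blocked by another process.
   Returns the PID of a process holding a conflicting lock,
   0 if the lock could be granted, or -1 if fcntl itself failed.
   (conflicting_lock_owner is a hypothetical helper.) */
pid_t conflicting_lock_owner(int fd, short type, off_t start, off_t len)
{
    struct flock region;

    memset(&region, 0, sizeof region);
    region.l_type = type;
    region.l_whence = SEEK_SET;
    region.l_start = start;
    region.l_len = len;
    region.l_pid = -1;

    if (fcntl(fd, F_GETLK, &region) == -1)
        return -1;

    if (region.l_type == F_UNLCK)
        return 0;              /* nothing would prevent the lock */

    return region.l_pid;       /* the process whose lock is in the way */
}
```

Calling this in a loop with start stepping through the file five bytes at a time would reproduce the kind of scan described above.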

Competing Locks

What happens when two programs compete for locks on the same section of the file?

We use lock3 to lock the file and a new program, lock5.c, to try to lock it again.

Try It Out - Competing Locks

1. After the #include and declarations, open a file descriptor:

2. The remainder of the program is spent specifying different regions of the file and trying different locking operations on them:

If we first run our lock3 program in the background, then immediately run the new program, the output we get is:

How It Works

First, the program attempts to lock a region from bytes 10 to 15 with a shared lock.

Then it unlocks its own shared lock on the region and attempts to unlock the first 50 bytes of the file.

The program attempts a shared lock on the region from bytes 40 to 50.

Finally the program again attempts to obtain an exclusive lock on the region from bytes 16 to 21.

Other Lock Commands

Another method of locking files uses the lockf function. This also operates using file descriptors. It has the prototype:

It can take the following function values:

The size_to_lock parameter is the number of bytes to operate on, from the current offset in the file.

Deadlocks

A deadlock occurs when two programs can't proceed because each has a lock on something that the other program needs.

Databases

Databases can store records that vary in size, and store and retrieve data using an index.

The dbm Database

X/Open compliant versions of UNIX come with a database called dbm.

The dbm Routines

Like curses, the dbm facility consists of a header file and a library that must be linked when the program is compiled.

dbm Concepts

The dbm database element is a block of data to store, coupled with a companion block of data that acts as a key for retrieving that data. Each block must have a unique key.

To manipulate these blocks as data, the ndbm.h include file defines a new type, called datum. The exact contents of this type are implementation-dependent, but it must have at least the following members:

dbm Access Functions

The prototypes for the main dbm functions are:

dbm_open

This function is used to open existing databases and can also be used to create new databases.

dbm_store

We use this function for entering data into the database.

dbm_fetch

This routine is used for retrieving data from the database.

dbm_close

This routine closes a database opened with dbm_open.

Try It Out - A Simple dbm Database

1. First of all, its include files, #defines, the main function and the declaration of the test_data structure:

2. Within main, we set up the items_to_store and items_received structures, the key string and datum types:

3. Having declared a pointer to a dbm type structure, we now open our test database for reading and writing, creating it if necessary:

4. Now we add some data to the items_to_store structure:


5. For each entry, we need to build a key for future referencing.

6. Then we see if we can retrieve this new data and, finally, we must close the database:

When we compile and run the program, the output is simply:

How It Works

First we open the database and then fill in the test data and keys. Then we set up the two datum structures for the data and keys. Finally we print out the retrieved data.

Additional dbm Functions

Now that we've seen the principal dbm functions, we can cover the few remaining functions that are used with dbm.

These are:

dbm_delete

The dbm_delete function is used to delete entries from the database.

dbm_error

The dbm_error function simply tests whether an error has occurred in the database, returning 0 if there is none.

dbm_clearerr

The dbm_clearerr function clears any error condition flag that may be set in the database.

dbm_firstkey and dbm_nextkey

These routines are normally used as a pair to scan through all the keys of all the items in a database.

The loop structure required is:

Try It Out - Retrieving and Deleting

Now we amend dbm1.c to use some of these new functions. The new version is called dbm2.c.

1. Make a copy of dbm1.c and open it for editing. Edit the #define TEST_DB_FILE line:

2. Then, the only change that we need to make is in the retrieval section:

The output is:

How It Works

The program simply stores some data in the database, builds a key, and deletes the item from the database.

The CD Application

Now we update the CD application.

We will use SQL terminology.

In code the table can be described by:

The CD Application Using dbm

Now we re-implement our application using the dbm database to store the information we need, with files cd_data.h, app_ui.c and cd_access.c.

Try It Out - cd_data.h

1. This is the data structure definition for the CD database.

2. Now we can define some access routines that we'll need.

Functions with cdc_ are for catalog entries; functions with cdt_ are for track entries.


Try It Out - app_ui.c

1. We now move on to the user interface. We start with some header files:

2. We make our menu options typedefs.

3. Now the prototypes for the local functions.

4. Finally, we get to main.

5. We're now ready to process user input.


6. When the main loop exits, we close the database and exit back to the environment.

7. Here we implement the show_menu function.



8. We extract a separate function, get_confirm, to ask the user if they are sure about what they are doing.


9. The function, enter_new_cat_entry, allows the user to enter a new catalog entry.


10. We now come to the function for entering the track information.


11. First, we must check whether a track already exists with the current track number.

12. If there was no existing entry for this track and the user hasn't added one, we assume that there are no more tracks to be added.

13. If the user enters a single d character, this deletes the current and any higher numbered tracks.

14. Here we get to the code for adding a new track, or updating an existing one.

15. The function del_cat_entry deletes a catalog entry.

16. The next function is a utility for deleting all the tracks for a catalog.

17. Next, we create a very simple catalog search facility.


18. list_tracks is a utility function that prints out all the tracks for a given catalog entry:

19. The count_all_entries function counts all the tracks:

20. Now we have display_cdc, a utility for displaying a catalog entry:

and display_cdt, for displaying a single track entry:

21. The utility function, strip_return, removes a trailing linefeed character from a string.


22. command_mode is a function for parsing the command line arguments.
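Of the utilities listed above, strip_return from step 21 is small enough to sketch in full. The version below is an illustration of the described behavior and may differ from the one in app_ui.c; the parameter name is an assumption.

```c
#include <string.h>

/* Remove one trailing linefeed from string_to_strip, if there is one.
   Strings without a trailing linefeed, including the empty string,
   are left unchanged. */
void strip_return(char *string_to_strip)
{
    size_t len = strlen(string_to_strip);

    if (len > 0 && string_to_strip[len - 1] == '\n')
        string_to_strip[len - 1] = '\0';
}
```

This is typically needed after reading a line with fgets, which keeps the newline that ended the user's input.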

Try It Out - cd_access.c

1. Now we come to the functions that access the dbm database. Here are the #include and #defines.


2. We use these two file scope variables to keep track of the current database:

3. By default, the database_initialize function opens an existing database.

4. database_close simply closes the database if it was open.

5. Next, we have a function for retrieving a single catalog entry.

6. We start with some sanity checks, to ensure that a database is open.

7. We set up the datum structure.

8. We also need to be able to get a single track entry.


9. The next function, add_cdc_entry, adds a new catalog entry:

10. add_cdt_entry adds a new track entry. The access key is the catalog string and track number acting as a composite.


11. This function deletes catalog entries:

12. Here's the equivalent function for deleting a track.


13. Last but not least, we have a simple search function.

If the search text points to a null character then all entries are considered to match.

14. As usual, we start with sanity checks:

15. If this function has been called with *first_call_ptr set to true, we need to start searching from the beginning of the database.

16. Our search facility is a very simple check to see whether the search string occurs in the current catalog entry.
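Two of the smaller ideas in these steps, the composite track key from step 10 and the substring search from step 16, can be sketched without any database calls. The function names make_track_key and entry_matches, and the "catalog, space, track number" key layout, are assumptions for illustration and may not match cd_access.c.

```c
#include <stdio.h>
#include <string.h>

/* Build a composite key of the form "<catalog> <track_no>" in key.
   Returns the key length, or -1 if it does not fit in key_size bytes.
   (The key layout is an assumed format, not taken from cd_access.c.) */
int make_track_key(char *key, size_t key_size, const char *catalog, int track_no)
{
    int n = snprintf(key, key_size, "%s %d", catalog, track_no);

    if (n < 0 || (size_t)n >= key_size)
        return -1;
    return n;
}

/* Returns 1 if search_text occurs anywhere in entry_text, or if
   search_text is empty (an empty search matches every entry);
   returns 0 otherwise. */
int entry_matches(const char *entry_text, const char *search_text)
{
    if (search_text[0] == '\0')
        return 1;
    return strstr(entry_text, search_text) != NULL;
}
```

In a dbm-based version, the string built by make_track_key would be wrapped in a datum and passed as the key when storing or fetching a track.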

We're now in a position to be able to put everything together with this makefile.

To compile your new CD application, type this at the prompt:

If all has gone well, the application executable will be compiled and placed in the current directory.

Summary

In this chapter we have looked at three different aspects of data management: memory, file locking, and the dbm library.





Copyright © 2001 by James L. Fuller, all rights reserved.