MEASUREMENT THEORY
First Assignment
Due 31 October 2004


The aim of this homework is to show you how to use Epsilon to process an Excel file into a simple text file. Although there are simpler ways of getting an Excel file into a text file, they often do not correctly right/left justify the data columns. This exercise is a repeat of the demonstration I gave on Friday, September 24th using some roll call voting data from Chile. Below are 3 problems, all of which I demonstrated in class. What I would like you to do is simply duplicate what I show below and e-mail me the resulting files. Because this is PASS/NO PASS, replicating the results is what I am looking for. The point of this assignment is for you to familiarize yourself with Epsilon. Epsilon is an extremely powerful tool. If you plan on working with data during your career I think that you will find that Epsilon gives you capabilities that no other tool has and will enable you to tackle projects that are extremely complex with "canned" packages such as Excel or STATA. Consequently, although the problems look "long", that is due only to the fact that I show screen shots of every step in the process along with brief descriptions of the commands.

One Note of Caution: You can do almost everything below by having this homework up in a browser and then doing the homework inside Epsilon. However, if you go to another Window in the middle of recording a Keyboard Macro it can mess the macro up! A safe way to proceed is to print out this homework first and then do the assignment. Alternatively, you can make the browser window really small and put it off to one side. In between instructions you can scroll the homework inside the browser.

Feel free to send me e-mail at kpoole@ucsd.edu or KPoole@weber.ucsd.edu if you get stuck and need help.

  1. To begin, download the Excel file

    Chile19982000.XLS -- Chile Roll Call Data

    save it in a directory, and bring it up in Excel.

    Now, bring up Epsilon by double-clicking on the Epsilon icon on your desktop. If you have not used Epsilon before, when it comes up it will look like this:



    If you have used Epsilon before it will bring up the last file that you edited. In any event, what you want to do now is open an empty file in the directory that you stored Chile19982000.XLS in. You can do this by using the usual file menu WINDOWS method or you can do it with the simple keystroke commands:

    C-X C-F -- hold Control Key down and type X (C-X) and hold Control Key down and type F (C-F)

    You should see something like this:



    Note the banner in red at the bottom of the Epsilon Window says Find File: c:\ucsd_homework_1\. Because I started Epsilon in the directory c:\ucsd_homework_1\ that is where Epsilon looks for the file. You can always use the Backspace key and type whatever path statement you want at this point. The C-X C-F is the find file command in Epsilon. The Epsilon Manual is on-line at:

    EPSILON On-Line Manual From Lugaru Software Ltd.

    The specific page for the find file command is:

    EPSILON On-Line Manual Find-File Page

    Now to bring up an empty file just make up any name you like, for example, junk1.txt, type it, and you should see:



    Now hit the Enter key and you should see:



    You are now ready to go (note the banner New file).

    Now shift to the Excel Window and it should look like this:



    Highlight the entire spreadsheet and put it on the clipboard:



    Bring Epsilon up and paste the spreadsheet into the file. You should see something like this:



    Note the blue asterisk to the right of the End -- *. The End means that your cursor is located at the bottom of the file and the * means that you have not saved the file yet (that is, it is in a buffer and not saved to disk yet). To save the file to disk you can use the normal WINDOWS File-Save or the Epsilon key-stroke command:

    C-X C-S -- hold Control Key down and type X (C-X) and hold Control Key down and type S (C-S)

    You will now see



    Note the disappearance of * and the appearance of written. The specific page for the save file command is:

    EPSILON On-Line Manual Save-File Page

    The next step is to change the "99" entries in the file from the Excel spreadsheet into "0" entries. The "99" stands for missing data and the missing data code must be a single digit so we will use "0". To do this we first need to go to the top of the file. To do that we use the:

    ESC < -- hit the Escape Key and then type < . You can also do the same thing by holding the Alt Key down and typing < (Don't forget that you need to hold down the Shift key to get the < key!!!)

    When you hit the ESC key you will see:



    Note the appearance of Alt -- and when you type < you will see:



    and your cursor is now at the top of the file (Note the appearance of Top at the bottom of the window).

    Before replacing the "99"s with "0"s we should remove the first line with the variable names. We really do not need it but it is generally good practice to store it in a separate file in case we need it later. To remove the first line type the command:

    C-K C-K -- hold Control Key down and type K (C-K) and hold Control Key down and type K (C-K) again

    This places the first line in a buffer. The first C-K grabs all of the line except for the newline character [^J] at the end of the line (do the command slowly and see for yourself). The second C-K tacks the newline on the end. You will now see



    The first line is gone and the blue asterisk -- * -- is back (the written stays because the file was written just before this command).

    The next step is to open another empty file to store the first line in case we want to use it for something. To do this we first split the screen with the command

    C-X 2 -- hold Control Key down and type X (C-X) and then type 2

    You will see:



    This command is discussed on this page of the Epsilon on-line manual:

    EPSILON On-Line Manual Creating Windows Page

    Now we need to open another empty file. To do this we use the find-file command:

    C-X C-F

    and type in an unused file name -- in this case junk2.txt:



    Now hit Enter and you see the new file in the bottom window:



    To place the first line into this file, you must yank it out of the buffer. To do this you use the command:

    C-Y

    Now you will see it in the new window:



    The yank command is discussed on this page of the Epsilon on-line manual:

    EPSILON On-Line Manual Killing Text Page

    We do not need the file junk2.txt for the time being so save the file with the Save-File command C-X C-S (note that this command saves the window that the cursor is in!). We can now close the window with the command:

    C-X 0 -- hold Control Key down and type X (C-X) and then type 0 (zero not "oh"!)

    This just says "kill this window but do not delete the file" and is discussed on this page of the Epsilon on-line manual:

    EPSILON On-Line Manual Kill Window Page

    We are now back to a single window showing junk1.txt.
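    For readers who think more easily in code, the kill-and-yank steps above amount to splitting the first line off from the rest of the text. The following is only a rough Python sketch of that idea, not a description of what Epsilon does internally; the function name split_header and the sample data are made up for illustration.

```python
def split_header(text):
    """Return (header_line, remaining_text), like C-K C-K at the top of a buffer."""
    # partition() splits at the first newline; the newline itself is dropped
    # from the header and the remainder keeps every later line intact.
    header, _newline, body = text.partition("\n")
    return header, body


# Hypothetical three-line buffer: one line of variable names, two data rows.
sample = "id\tname\tparty\tv1\n101\tPEREZ\t3\t1\n102\tSOTO\t5\t99\n"
header, body = split_header(sample)
print(header)  # the variable names, which went into junk2.txt
print(body)    # what stays behind in junk1.txt
```

    As in the editor, nothing is lost: the header and the body together still contain everything that was in the original buffer.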



    To replace the "99"s with "0"s we use the replace-string command (from now on I will simply insert the links to the on-line Epsilon manual). The command is:

    ESC & -- hit the Escape Key and then type & (ampersand -- Don't forget that you need to hold down the Shift to get it!!) Also, you can do the same command by holding the Alt Key down and typing &.

    You should see this:



    Note the Replace string: at the bottom of the window. Type 99



    and hit the Enter Key and you will see with: at the bottom of the window. Epsilon remembers the last string that you inserted using the replace-string command and it will show it highlighted in Blue. If this happens just hit the Backspace Key and type the string to be inserted. In this instance, we have not replaced any strings so nothing is shown. Now type 0 (zero) and you should see this:



    Hit the Enter Key and you will see



    and you have replaced all the "99"s with "0"s. Note that Epsilon tells you how many replacements it made from the point where the cursor is to the end of the file.
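    One thing to keep in mind is that replace-string works on character strings, not on spreadsheet cells, so a "99" that happens to sit inside a longer number would be changed as well. If no other column ever contains 99 this makes no difference. The Python sketch below illustrates the distinction on a made-up row; the helper name recode_missing and the data are assumptions for illustration only.

```python
def recode_missing(row, missing="99", replacement="0"):
    """Recode whole tab-separated cells equal to `missing`, leaving other cells alone."""
    cells = row.split("\t")
    return "\t".join(replacement if cell == missing else cell for cell in cells)


# Hypothetical row: an ID of 1995, a name, a party code, then three votes.
row = "1995\tPEREZ\t3\t1\t99\t6"

# Plain string replacement, which is how replace-string behaves:
print(row.replace("99", "0"))   # the ID 1995 is changed too
# Cell-by-cell recoding:
print(recode_missing(row))      # only the vote coded 99 is changed
```

    A quick look down the ID column after the replacement is usually enough to confirm that nothing unintended was touched.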

    At this point it would be a good idea to save the file. The reason is that if you make a mistake at some point you can exit Epsilon and then start over with the last saved version of the file. For example, if you tried to exit Epsilon using the exit command:

    C-X C-C -- hold Control Key down and type X (C-X) and hold Control Key down and type C (C-C)

    You would see:



    If you then clicked the "Exit" button only the last saved version of junk1.txt would be on disk.

    Because Excel separates all the cell entries with tabs we need to get these out of the file. To do this we will use a simple "trick" -- namely, we will replace all the tabs -- technically these are C-I's or ^I -- with a text string -- "   ucsd   " (3 spaces, then ucsd, then 3 spaces). Note the spaces before and after ucsd. This makes life really easy as I will demonstrate below.

    Use the replace-string command as described above only now simply hit the Tab Key and you will see



    Now hit the Enter Key and you will see



    Note the Blue highlighted 0 (zero). Hit Backspace and type in "   ucsd   " (3 spaces, then ucsd, then 3 spaces) and you will see



    Hit the Enter Key and you will see



    The tabs are now gone so that the entire file is a text file. The usefulness of the "   ucsd   " can now be seen by eyeballing the file and comparing it to the original Excel file; namely, starting from the beginning of each line there are exactly 3 "ucsd"s before the first roll call vote. Consequently, we can write a simple macro to search over for 3 "ucsd"s and stop, put the rest of the line in a buffer, move to another empty file, yank the line out of the buffer, and return to junk1.txt. Executing this macro for each line produces a matrix of the roll calls plus the "   ucsd   " strings which we can easily remove.
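    The logic of "skip 3 markers, keep the rest of the line" can be written down in a few lines of Python. This is just a sketch to make the idea concrete: the marker is assumed to be 3 spaces, ucsd, 3 spaces, and the function name rest_after_markers and the sample row are hypothetical.

```python
MARKER = "   ucsd   "  # assumed: 3 spaces, ucsd, 3 spaces


def rest_after_markers(line, marker=MARKER, count=3):
    """Return the part of `line` that follows the first `count` markers."""
    pieces = line.split(marker, count)   # split at most `count` times
    if len(pieces) <= count:
        raise ValueError("line has fewer than %d markers" % count)
    return pieces[count]


# Hypothetical row: ID, name, party, then three votes, all joined by the marker.
line = MARKER.join(["101", "PEREZ", "3", "1", "6", "0"])
print(rest_after_markers(line))
```

    The returned string still contains the markers between the votes, which matches what the macro leaves in the second file before the final clean-up.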

    To begin, save junk1.txt in case we make a mistake with the macro, split the window using the C-X 2 command and then use the find-file C-X C-F command to bring up another empty file -- call it junk3.txt. You should now see:



    Note that your cursor is at the top of the lower window in the upper left corner of the empty file. We need the cursor in the upper window to start writing the macro (actually we could start the macro in the lower window but it is less elegant!). To toggle between windows, use the move-to-window commands or the selecting-windows commands. The easiest forms of this are:

    C-X P

    which moves the cursor up to the next window; and

    C-X N

    which moves the cursor down to the next window. You can also use the arrow keys

    C-X [UP]

    which moves the cursor up to the next window; and

    C-X [DOWN]

    which moves the cursor down to the next window. [UP] and [DOWN] are the corresponding arrow keys. If you had split the screen into two side-by-side windows using the command:

    C-X 5

    then you could move back and forth using either the C-X P and C-X N commands or the C-X [RIGHT] and C-X [LEFT] commands, where [RIGHT] and [LEFT] are the corresponding arrows.

    In any event, move the cursor to the upper window. To begin the macro, use the start-keyboard-macro command:

    C-X (

    where "(" is the left parenthesis. You will see



    Note the Remembering. This means that until you issue the end-keyboard-macro command:

    C-X )

    where ")" is the right ("close") parenthesis, Epsilon records all the keystrokes that you make. You literally are writing a program using the keyboard.

    To begin, we search forward until we find 3 "   ucsd   "s. To do this we use the incremental-search command which is simply C-S. You will see the banner I-Search. You now type in the string that you are searching for and note that Epsilon starts searching forward with every keystroke that you make. (Warning: If you go to another Window in the middle of the C-S command -- for example, you are reading these instructions and trying to do the assignment in another Window -- when you go back you will lose the highlighting! That is, moving to another Window when you are doing some Epsilon commands while writing a macro may mess up the macro!) Type in "   ucsd   " (3 spaces then ucsd then 3 spaces) and you will see



    Now simply type C-S two more times and you should see



    Now hit the [LEFT] arrow key. This stops the searching and moves the cursor one column to the left; or one space to the left of the beginning of the roll call votes. It should look like this:



    Now put the line to the right of the cursor in a buffer by using C-K, go to the beginning of the line with the command C-A, move down one line with the command C-N, and you should see this:



    Note that this positions the cursor in the upper window so the sequence of searching and grabbing the roll calls on a line can be repeated.

    Now move down to the lower window with C-X [DOWN] or C-X N, yank the line from the buffer with the command C-Y, and type Enter. Things should look like this:



    To finish the macro move the cursor back to the upper window with C-X [UP] or C-X P (note that the cursor is at the beginning of the second line), and close the macro with the command C-X ). It should look like this:



    Note the Keyboard macro defined banner at the bottom of the screen.

    It is now a matter of executing the macro for the remaining lines in the file. To figure out how many lines there are in a file use the count-lines command, C-X L. It literally counts the number of lines in the file (technically, it is counting the number of newline characters [^J] in the file). You should see this



    Note that the banner at the bottom tells you the number of lines, what line you are on, and the size of the file. I have found it to be good practice to NEVER execute a macro all the way to the end of a file simply because there may be some blank lines at the bottom and sometimes that can cause you a great deal of grief. To be safe, let's execute the command 117 times. To repeat a command type C-U and you should see:



    Note that the banner shows that the default repeat count is 4. Now simply type 117 from the keyboard and it overwrites the "4". It will look like this:



    What Epsilon will now do is repeat 117 times any command that you now enter from the keyboard. The command to execute a macro is C-X E. When you type C-X you should see:



    Now hit the letter E. The macro will now run 117 times and the result should be



    We only have 3 lines left so you can either type C-X E C-X E C-X E or C-U, hit 3 and then type C-X E. Either way you end up with this:



    Note that your cursor is in the upper window at the End of the file and neither file has been saved. Save junk1.txt and close the window; namely use the commands C-X C-S and then C-X 0. Now save junk3.txt and we should now be here:



    All we have left to do is remove the "   ucsd   "'s and we have a nice neat roll call matrix. To do this we need to go to the beginning of the file. The command to do this is ESC < (hit the Escape Key and then type < (less than sign)) or Alt-< (hold Alt Key down and type <). This will put you at the top of the file:



    Use the replace-string command as described above, only now search for "   ucsd   " (3 spaces, ucsd, 3 spaces) and replace it with nothing (an empty string). Here is what it looks like in sequence. First the ESC & followed by typing in the "   ucsd   ":



    Next, after hitting Enter, Epsilon remembers the last string we used to replace:



    Now simply hit Backspace



    And hit Enter again



    Save Junk3.txt and we are done.
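    Before e-mailing the file it can be worth checking that the result really is a rectangular matrix of single-digit codes. The Python sketch below is one hypothetical way to do that check outside Epsilon; the function name check_matrix and the sample rows are made up, and the check assumes that leading or trailing blanks on a line can be ignored.

```python
def check_matrix(lines):
    """Return the common row width, or raise ValueError if the rows are ragged."""
    rows = [line.strip() for line in lines if line.strip()]  # ignore blank lines
    widths = {len(row) for row in rows}
    if len(widths) != 1:
        raise ValueError("rows have different lengths: %s" % sorted(widths))
    if not all(row.isdigit() for row in rows):
        raise ValueError("found a character that is not a digit")
    return widths.pop()


# Hypothetical finished file: three legislators, four roll calls each.
print(check_matrix([" 1609", " 6110", " 0016", ""]))
```

    A ragged row usually means that a "99" was missed or that a marker string was left behind somewhere.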

    It is usually good practice to name the file so that you will remember what it is (more important for an oldster like me!!). This is really easy from Epsilon using the Write File command:

    C-X C-W

    Issue the command and you will see this:



    Now simply type your favorite file name from the keyboard (it is a good idea NOT to put spaces in the file name -- note that I use the underscore "_" to connect words, which is a VERY SAFE way of doing things!):



    And hit Enter



    and you have the file. This does not delete Junk3.txt! It simply writes a copy to a new file!

    E-Mail me your file for a PASS on this part of the problem.

  2. In this problem I will show you how to run the keyboard macro described above as a command file. This approach is very handy for complex or large macros. For example, suppose you want to put a large matrix of roll calls into STATA. This is easy to do using Epsilon but it involves running a macro that inserts commas between each roll call choice. If you tried to do this using a Keyboard Macro on a file with 50 or more roll calls you would soon discover that it simply is not practical! However, it is easy to do by running a macro within a macro but to do nested macros you need to know how to run simple command files.
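    To make the STATA example concrete, here is what "inserting commas between each roll call choice" means for a single row, written as a small Python sketch rather than as an Epsilon macro. It assumes every vote is a single character, as in the matrix produced in problem 1; the function name comma_separate is made up for illustration.

```python
def comma_separate(votes):
    """Put a comma between each single-character vote code."""
    # Joining a string treats it as a sequence of its characters.
    return ",".join(votes)


print(comma_separate("1609"))  # 1,6,0,9
```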

    Download these two files

    Homework-Part-1.TXT -- Keyboard Macro as Text File

    Junk11.TXT -- Chile Roll Call Data with the ucsd's in it

    And bring them up in Epsilon in a split screen so it looks like this:



    You can adjust the size of the windows by using C-PageUp or C-PageDown -- that is, hold down the Ctrl key and hit PageUp or PageDown and it changes the size of the window that the cursor is in. If you do C-PageDown five times it will shrink the lower window down so you see more of the upper window and it should look something like this:



    Note that Junk11.TXT is the file Junk1.TXT from part 1 above before we ran the keyboard macro to get the roll calls into junk3.txt. Before we start the macro we need an empty file so place the cursor in the window with junk11.txt (recall that you can use the mouse or the C-X N or C-X P commands to move between windows), split the window with C-X 2, and get an empty file called junk33.txt with the command C-X C-F as described in part 1. You now should have something like this:



    Move the cursor up to junk11.txt using C-X P. Now we are ready to run the macro. We will run it first and then I will explain the logic of it afterwards. To run the macro type

    Alt-X -- hold Alt Key down and type X

    and you will get:



    Note the Command: prompt at the bottom of the screen. You can now type any Epsilon Command at this prompt. Type load-buffer and hit the Enter Key and you should see:



    Epsilon thinks that you are referring to the buffer that you have the cursor in. Just hit Backspace and type homework-part-1.txt so it looks like this:



    Now simply hit the Enter Key and you should get this:



    What Epsilon has done is compiled the macro and found no errors. What this means is that we can now run it. Note that in the macro file after the define-macro there is a name in quotes -- homework-part-1. That is the name that Epsilon has assigned to the macro. The macro itself is the block of text between the quotes (I use "[" rather than "<" below to avoid confusing HTML!!!!):

    C-U3 [!string-search]    ucsd   C-MC-BC-KC-AC-NC-XnC-YC-MC-Xp
    
    C-U3 repeats the string-search command 3 times.
    [!string-search]    ucsd   C-M is the search command -- C-M is a carriage-return
    C-B backs the cursor up one character
    C-K places the string to the right of the cursor in the kill buffer
    C-A places the cursor at the beginning of the line
    C-N moves the cursor down one line
    C-Xn moves the cursor down to the next window
    C-Y yanks the string from the kill buffer
    C-M is equivalent to hitting Enter (a carriage-return)
    C-Xp moves the cursor up one window
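    If it helps to see the same steps outside Epsilon, the following Python sketch imitates one pass of the macro on two lists of lines standing in for the upper and lower windows. It is only an illustration under stated assumptions: the marker is taken to be 3 spaces, ucsd, 3 spaces, and the function name run_macro_once and the sample rows are hypothetical.

```python
MARKER = "   ucsd   "  # assumed: 3 spaces, ucsd, 3 spaces


def run_macro_once(upper, row, lower, marker=MARKER):
    """Apply one pass of the macro to upper[row]; return the next row number."""
    line = upper[row]
    pos = 0
    for _ in range(3):                    # C-U3 string-search
        found = line.find(marker, pos)
        if found == -1:
            raise ValueError("fewer than 3 markers on row %d" % row)
        pos = found + len(marker)         # cursor lands just after the match
    pos -= 1                              # C-B: back up one character
    killed = line[pos:]                   # C-K: kill to the end of the line
    upper[row] = line[:pos]               # the upper window keeps what is left
    lower.append(killed)                  # C-Xn, C-Y, C-M: yank plus Enter below
    return row + 1                        # C-A and C-N leave us on the next row


upper = [MARKER.join(["101", "PEREZ", "3", "1", "6"]),
         MARKER.join(["102", "SOTO", "5", "0", "1"])]
lower = []
row = 0
for _ in range(len(upper)):               # like C-U <count> before the command
    row = run_macro_once(upper, row, lower)
print(lower)
```

    Note that each line in the lower list starts with one blank, because the C-B step backs the cursor up onto the last space of the third marker before the kill.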
    
    To run it, use the command Alt-X to bring up the Command: banner and then type homework-part-1:



    Now type Enter and the macro executes one time. This is equivalent to doing C-X E as described in part 1. This produces:



    You can run the command multiple times by using the C-U command as described in part 1. Type C-U then 117 then Alt-X to bring up the Command: banner and then type homework-part-1. You should get:



    Now type Enter and the macro executes 117 times producing:



    To finish repeat the above except use "3" instead of "117" and you have the same file as part 1. Type C-U then 3 then Alt-X to bring up the Command: banner, type homework-part-1 and type Enter and the macro executes 3 times:



    Process Junk33.txt in the same way that you processed Junk3.txt to get a roll call file as you did in part 1.

    E-Mail me your file for a PASS on this part of the problem.



  3. In this problem we will process the header file -- Junk1.TXT -- that we saved above. To begin, we need to bring up Junk1.TXT in Epsilon. We can do this with either the find-file (C-X C-F) command or, if you are forgetful about the directory that it is in (as I often am!), you can use the show-buffers command, C-X C-B, to show all the files that Epsilon has recently worked with. You should see:



    Simply use the [UP] and [DOWN] arrow keys to move up and down the file list. Note that as you do so they appear behind the menu box. To select a file just click "OK".

    A word of warning! Epsilon has a limit on the size of the buffers it retains (this is to increase the speed that it comes up when you invoke it). This is easy to adjust but let's not do that now. (If you want to know how, send me an e-mail and I will tell you how to do it.) Suffice to say, you can always use the C-X C-F command to bring up the file.

    Bring up junk1.txt, split the window with C-X 2, and get an empty file called names.txt with the command C-X C-F as described in part 1. You now should have something like this:



    You can now see the logic of using the "   ucsd   " string. In any nicely organized file we would want the names to be left justified. Note that the first "ucsd" is always 3 spaces from the beginning of the name! Hence, to get a left justified name all we need to do is write a simple keyboard macro to search over to the first "ucsd", place the string between the first and second "ucsd"'s in a buffer, move to the names.txt window, yank it from the buffer, move back up the top window, etc.

    The ID and party codes are integers that should be right justified. We can do that for the ID code by inserting a bunch of spaces at the beginning of each line of junk1.txt by simply using the Space-Bar, searching forward to the first "ucsd", backing up from the ucsd to get the proper width for the numbers so that they are right justified, putting the string in a buffer, moving to another window, yanking the buffer, and moving back to the original window.
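    For comparison, right justification is a one-line operation in most programming languages. The Python sketch below shows the intended result for a few made-up codes; the field width of 5 and the function name right_justify are assumptions chosen only for illustration.

```python
def right_justify(code, width=5):
    """Pad an integer code with leading spaces so it ends in column `width`."""
    return str(code).rjust(width)


# Hypothetical ID codes of different lengths.
for code in [7, 42, 1234]:
    print(repr(right_justify(code)))
```

    Every result has the same length, so the last digit of each code lines up in the same column.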

    I will save the more complicated integer right justification macros for the second homework! For now, let's do the names. To begin, put the cursor in the upper window, and start the macro with the start-keyboard-macro command C-X ( where "(" is the left parenthesis. The first step is to search forward until we find "   ucsd   ". To do this we use the incremental-search command which is simply C-S. You will see the banner I-Search. You now type in the string that you are searching for and note that Epsilon starts searching forward with every keystroke that you make. (Warning: If you go to another Window in the middle of the C-S command -- for example, you are reading these instructions and trying to do the assignment in another Window -- when you go back you will lose the highlighting! That is, moving to another Window when you are doing some Epsilon commands while writing a macro may mess up the macro!) Type in "   ucsd   " (3 spaces then ucsd then 3 spaces) and you will see



    Now hit the [LEFT] arrow key. This stops the searching and moves the cursor one column to the left; or one space to the left of the beginning of the names. It should look like this:



    Now put the line to the right of the cursor in a buffer by using C-K, yank it back out using C-Y (note that when you do this the cursor will be at the end of the line -- we want junk1.txt to be intact so we can use it later!), go to the beginning of the line with C-A, move down one line with C-N (note that this positions the cursor at the beginning of the second line with junk1.txt still intact!!), move down to the names.txt window with C-X [DOWN] or C-X N, yank the line from the buffer with the command C-Y, then go to the beginning of the line with C-A. You should be here:



    Now, the obvious thing to do is to search forward to the "ucsd" and kill the line from the immediate left of "ucsd". However, note that not all the names will be the same length! We want the name string to be the same length on every line. To do this we will search forward to "ucsd", back up one space to the left of the "u" in "ucsd", hit the Space-Bar 30 times to be safe, go back to the beginning of the line, move forward 35 spaces, kill the rest of the string, hit Enter to position ourselves on a new line in names.txt, and finish the macro by going up to the top window.

    First, search forward for "ucsd" -- no spaces!! -- using C-S and typing in "ucsd". It will highlight ucsd in blue as described above. Now hit the [LEFT] arrow key 5 times. The blue highlighting disguises the fact that the cursor is located just to the right of the "d" in "ucsd". By backing up 5 times you are now one space in front of "ucsd". It should look like this:



    Now just hit Space-Bar 30 times and it should look like this:



    Kill the rest of the line with C-K and move back to the beginning of the line with C-A.

    Now you can see the reason why we added so many blank spaces after the name. What we are going to do now is move forward 35 spaces and kill whatever is left in the line. Note that this makes every line exactly 35 characters in length. If you wanted to allow for even longer names all you have to do is add more Space-Bar's in the step above.
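    The pad-then-cut idea is easy to check in a few lines of Python. This is a simplified sketch, not an exact replay of the keystrokes: it ignores any blanks that the macro leaves around the name, and the function name fixed_width and the sample names are made up.

```python
def fixed_width(text, width=35, padding=30):
    """Pad with `padding` spaces, then cut at `width`, as the macro does by hand."""
    padded = text + " " * padding   # Space-Bar 30 times
    return padded[:width]           # move forward 35 characters, kill the rest


# Hypothetical names, one short and one long.
for name in ["PEREZ", "ALVAREZ-SALAMANCA"]:
    line = fixed_width(name)
    print(repr(line), len(line))
```

    In this simplified version a text shorter than 5 characters would come out shorter than 35, because 30 spaces of padding are not enough to reach the cut point; Python's str.ljust(35) followed by a slice avoids that corner case.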

    To move forward you can either hit the [RIGHT] arrow key 35 times or use C-U35 [RIGHT]. The [RIGHT] arrow key is the same as the keyboard command C-F -- move cursor forward one character. So the equivalent command would be to do C-U35 C-F. The two are exactly the same. Kill the remaining characters in the string with C-K and you should now be here:



    Now hit Enter to position the cursor at the beginning of the next line. To finish the macro move the cursor back to the upper window with C-X [UP] or C-X P (note that the cursor is at the beginning of the second line in junk1.txt), and close the macro with the command C-X ). It should look like this:



    Note the Keyboard macro defined banner at the bottom of the screen.

    Run the macro 117 times as described in part 1 above using C-U 117 and then C-X E, and you should see:



    We know that we are near the bottom of the file because Epsilon tells us what percentage of the file is above the cursor! Note the 97% in the name line of the window. Without moving the cursor we can move the file up and down in the window by using the keyboard commands:

    C-Z -- move file up one line in the window, or

    Alt-Z -- move file down one line in the window.

    Do C-Z 3 times and you will get



    Note how handy this is! You can see where you are in the file without moving the cursor. We can see that we need to run the macro 3 more times and we are done:

    C-U 3 and then C-X E.



    Save both files with C-X C-S (we will need junk1.txt for the second homework).

    E-Mail me your names.txt file for a PASS on this part of the problem.