2.S01 What is a Text File?

What is a text file? Consider files like PowerPoint files, PDFs, or Word documents. Opening and changing these files requires the corresponding program, such as PowerPoint, a PDF reader, or Microsoft Word. These files also have corresponding suffixes, like .pptx, .pdf, or .docx, to indicate the format of the file.

If you have such a file, e.g., filename.pdf, you can look at its raw contents with:

 $ cat filename.pdf

It might look something like:

 J=?:??v,m*=6?>???qN?.J?g?g???[????
 ????^[=&=]??g5gS?¨Sj!?$?????<f??j+D+:WZ??en@?/???????}???_/}???M??????????I?S??-
 {?|/??,?+*-???????????<?-y[?v???w`w?v????y?X?8?xp]????Q?Tz`q?tOYhY?^??;?~-?

A plain text file may look something like:

 were Class A boats of conventional design - open boats hung over the
 deck from cranelike arms, or davits, strung with block and
 tackle. The smallest of these boats could seat fifty-one people; the
 largest, sixty-nine. In an emergency, the boats were to be swung out
 over the sea and lowered to the deck rails so that passengers could
 climb in. Once the boats were filled, two crewmen would manage the
 ropes.

The above text seems "plain" enough. There doesn't seem to be any format other than the sentence prose of the author. Any number of editor programs will allow you to open, edit, and save this file, for example emacs, vim, TextEdit, or gedit. One or more such programs may already be on your computer.
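
The difference between the two examples above can also be checked by a program. The following is a minimal sketch in C++, not a definitive test: the function names looksLikeText and readFile are hypothetical, and the rule used (no NUL bytes, and at most 5% other control characters) is an assumption chosen for illustration. Bytes above 127 are not counted as suspicious, so UTF-8 text passes.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Heuristic (an assumption of this sketch, not a standard): treat data as
// plain text if it has no NUL bytes and at most 5% other control characters.
bool looksLikeText(const std::string& bytes) {
  if (bytes.empty()) return true;  // an empty file is trivially "text"
  std::size_t suspicious = 0;
  for (unsigned char c : bytes) {
    if (c == 0) return false;  // NUL bytes almost never occur in plain text
    bool allowed_whitespace = (c == '\n' || c == '\r' || c == '\t');
    if ((c < 32 && !allowed_whitespace) || c == 127) ++suspicious;
  }
  // suspicious / size <= 1/20, written without floating point
  return suspicious * 20 <= bytes.size();
}

// Read an entire file as raw bytes. Returns an empty string if the file
// cannot be opened.
std::string readFile(const std::string& path) {
  std::ifstream in(path, std::ios::binary);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

Calling looksLikeText(readFile("filename.pdf")) would typically return false for a PDF that contains compressed data, while the lifeboat passage above would return true. A heuristic like this can be fooled, so treat the result as a hint rather than a guarantee.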

In between the two above extremes, there are files that have a certain format, with a notion of valid and invalid structure. Such a file is more or less "human readable" if you can squint past the syntax markers:

 <!DOCTYPE html>
 <html>
 <title>HTML Tutorial</title>
 <body>

 <h1>This is a heading</h1>
 <p>This is a paragraph.</p>

 </body>
 </html>

The above file is an .html file. HTML stands for Hyper Text Markup Language, the standard markup language for creating Web pages. The extra markup tells a web browser what the content is and how to render it. There also exist WYSIWYG (what you see is what you get, pronounced wiz-ee-wig) tools for editing the HTML file and seeing the rendering alongside the file as it is being edited:

Figure 1.1: A WYSIWYG tool for editing an HTML file and showing the rendering change as the edits are applied.

In the above example, you could imagine that if you are spending a lot of time composing a thoughtful web page, working with a specialized editor tool that shows the rendering as you type would be the way to go. No argument there. However, you could also imagine that, if the file already exists and you only want to change "This is a heading" to "This is my heading", the file is human readable enough that a simple text editor would suffice. Even if the file were large, you could open it with a text editor, use the find or search tool that virtually all text editors have, go right to that sentence, make the simple modification, and save the file.
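
Because the file is plain text, the same find, modify, and save steps can also be done by a short program. The sketch below is one possible way to do it in C++, under stated assumptions: the function names replaceFirst and editFile are hypothetical, only the first match is replaced, and the whole file is read into memory.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>

// Replace the first occurrence of `from` in `text` with `to`.
// Returns true if a replacement was made, false if `from` was not found.
bool replaceFirst(std::string& text, const std::string& from,
                  const std::string& to) {
  if (from.empty()) return false;  // nothing sensible to search for
  std::size_t pos = text.find(from);
  if (pos == std::string::npos) return false;
  text.replace(pos, from.size(), to);
  return true;
}

// Open a text file, apply replaceFirst to its contents, and save it back.
// Returns false if the file cannot be read, the text is not found, or the
// write fails. The file is left untouched when the text is not found.
bool editFile(const std::string& path, const std::string& from,
              const std::string& to) {
  std::ifstream in(path);
  if (!in) return false;
  std::stringstream buffer;
  buffer << in.rdbuf();  // read the whole file into memory
  in.close();
  std::string text = buffer.str();
  if (!replaceFirst(text, from, to)) return false;
  std::ofstream out(path, std::ios::trunc);  // overwrite the original file
  out << text;
  return static_cast<bool>(out);
}
```

For the example above, a call such as editFile("index.html", "This is a heading", "This is my heading") would make the change, where index.html is an assumed file name. This only works because the file is stored as plain text; the same approach applied to a .docx or .pdf file would generally not find the sentence at all.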

Writing C++ is not much different. The code has syntax, just like the HTML example, and there exist Integrated Development Environments (IDEs) that allow you to edit a C++ file with lots of features that present the overall coding context as you edit. A common IDE as of this writing is VS Code. However, a C++ file is still a plain-text file that is human readable, and you can (and many people do) compose large C++ projects in text editors like emacs and vim. This is largely made possible by the fact that powerful text editors like emacs and vim have modes that detect when you are editing a C++ file and provide syntax highlighting and other features that bring some of the contextual support of an IDE.

In short, with text files, you always have the option of using a text editor like emacs or vim. For some text files with certain formats, there also exist specialized editor tools that can simplify the editing with greater visual feedback as you edit. Mastering at least one text editor is an essential skill. There will be times when a file needs to be edited and the GUI-based tool is not available, especially when editing a file on a machine through a remote login.


Document Maintained by: mikerb@mit.edu        
Page built from LaTeX source using texwiki, developed at MIT. Errata to issues@moos-ivp.org.