43 Folders

Academic notes in one big text file + tag clouds

Back when I first started writing my dissertation (seems like another lifetime now), I used FileMaker Pro to store all my notes. After hundreds of dollars and a lot of headaches, I now realize that I could easily have used the command line and text files. And I would have ended up with a much more portable, robust, and versatile data bank than anything FileMaker could offer.

I present my quick and dirty note storage solution here as a command line newbie, so I would be grateful for any tips or help. Thought it might be of interest to all the grad students and academics out there.

1) I take notes on index cards. This limits me to "atomic" bits of information, thwarting my bad habit of keeping a massive stream of unbroken notes. (When I used FileMaker, some of my entries were several pages long, clearly defeating the purpose of a database.)

2) When I process my inbox, I append these new notes to a growing text file. Each note occupies a single line and gets a unique date/time stamp. It also gets a bibliographic code (author-date of publication - e.g., Wallace03), which I can use to search a separate .bib file. (I save the index cards as a paper backup.)

3) I tag each note with several keywords, using the prefix "kw" to set them apart. E.g., kwrenaissance kwprinter kwstatistics kwflorence kw1490s: these might be tags for a note on the number of printers in Renaissance Florence.

4) I can now quickly search my notes using grep. If I want to look only for keywords, I begin my search query with kw. So let's say I wanted to see my notes tagged "Beethoven." I would type "grep kwbeethoven notes.txt" (I leave all my keywords uncapitalized), and immediately all the relevant lines pop up. If I want a broader search, I leave out the kw and turn on case-insensitive matching with -i.

5) If I'm not sure whether I've used a keyword before (or if I'm lazy), I can always count on Vim's autocomplete feature. Let's say I want to enter a keyword for Roman Law but can't remember whether I've previously used kwlawroman or kwromanlaw. I simply type kwrom in insert mode followed by CTRL-N or CTRL-P, which lets me flip through all the other words in the document beginning with kwrom. I can do the same thing for kwlaw.

6) tr -cs 'A-Za-z0-9' '\012' < notes.txt | grep '^kw' | sort | uniq -c | sed 's/ kw/ /' > keywords.txt

And finally, this ugly little command will pull out all the words beginning with kw, strip them of that unsightly kw, count how many times they appear in my notes, and put the results in a text file. I'm sure there would be a prettier way to do this, but it works OK for me. So I get a nice little tag cloud of all my keywords together with frequency, which looks like this:

2 1830s
3 1920s
3 abacus
2 ardvark
5 barbarians
2 beethoven
[and so on...]
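
Steps 2 and 4 above can be sketched in a few lines of shell. This is only a rough illustration, not the exact workflow described: the helper name addnote, the timestamp format, and the two sample notes are all made up for the example.

```shell
#!/bin/sh
# Sketch only: "addnote" and the sample notes are hypothetical.
NOTES="${NOTES:-notes.txt}"

# Append one note as a single line, prefixed with a date/time stamp.
# Usage: addnote "body text kwtag1 kwtag2 AuthorYY"
addnote() {
    # Fold any newlines into spaces so each note stays on one line.
    body=$(printf '%s' "$*" | tr '\n' ' ')
    printf '%s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$body" >> "$NOTES"
}

addnote "Number of printers in Florence kwrenaissance kwprinter kwflorence kw1490s Wallace03"
addnote "Late quartets kwbeethoven kwmusic Wallace03"

grep kwbeethoven "$NOTES"     # tag search: query starts with kw
grep -i florence "$NOTES"     # broader search, case-insensitive
```

Because every note is a single line, grep returns whole notes, stamp and bibliographic code included.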

I guess my only question now is how big my notes file can become before it gets too slow and/or unwieldy to work with.

I'm thrilled to discover the simplicity of text. It's free, universal, and so much more secure than my other solutions. And with grep and redirection, it's so easy to pull together a new, smaller text file of notes on a particular subject.
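
As a rough sketch of pulling a subject file together with grep and redirection (the file names and the three sample lines here are invented for illustration):

```shell
#!/bin/sh
# Sketch only: file names and sample notes are made up.
printf '%s\n' \
  '2006-09-10 Printers in Florence kwrenaissance kwprinter kwflorence' \
  '2006-09-11 Late quartets kwbeethoven kwmusic' \
  '2006-09-12 Printing and reform kwprinter kwreformation' > sample-notes.txt

# Every note tagged kwprinter goes into its own file.
grep kwprinter sample-notes.txt > printing.txt

# Notes carrying either of two tags (-e may be repeated).
grep -e kwbeethoven -e kwreformation sample-notes.txt > mixed.txt

# Notes carrying both tags: filter twice.
grep kwprinter sample-notes.txt | grep kwflorence > florence-printing.txt
```

Note that a plain grep for kwprinter would also match a longer tag such as kwprinters; adding -w restricts the match to whole words.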

The nice thing is that I can always go from here to other platforms. At some point, I plan to split my big text file into single-entry files and import all these little files into DevonThink, so that I can take advantage of that program's "See Also" feature.

But I like the text file. Besides, I can access my database from anywhere using SSH.

TOPICS: Life Hacks

Update - Notes text file

Thanks everyone for your responses to my text file note system. I sort of lost track of this thread for a while, but wanted to give you a bit of an update. Since I'm a command line noob, I'm learning as I go along. :)

Ish, in response to your question about exporting the notes to DevonThink, I have a couple of solutions. Note: both of them require that each note entry is limited to one line.

1) Split the big file on the command line:

split -l 1 -a 3 notes.txt [path/newfileprefix]

This will split the big file into hundreds (or thousands) of atomic files, one per entry. With three-letter suffixes they will be labelled aaa, aab, aac, aad, aae, and so on. If you want them to be prefixed with something like "note" (e.g., noteaaa, noteaab, noteaac, etc.), then you can give the word note as the prefix at the end of your split command. Unfortunately, DevonThink won't import these files unless they have the .txt extension. If you can add this from the command line, more power to you. Otherwise, you can use the AppleScript "Add to File Names" (under Finder Scripts). Unfortunately, the notes will also be imported to DevonThink with these highly non-descriptive file names.
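
For what it's worth, adding the .txt extension from the command line can be done with a short loop. The following is a minimal sketch, assuming the prefix "note", three-letter suffixes (split's -l 1 and -a 3 options), and a small made-up sample file:

```shell
#!/bin/sh
# Sketch only: the "note" prefix and the sample file are assumptions.
printf '%s\n' 'first note kwalpha' 'second note kwbeta' 'third note kwgamma' > sample-notes.txt

# One line per file, three-letter suffixes: noteaaa, noteaab, ...
split -l 1 -a 3 sample-notes.txt note

# Append .txt to every piece; the pattern note??? only matches
# names without an extension, so rerunning the loop is harmless.
for f in note???; do
    if [ -f "$f" ]; then
        mv "$f" "$f.txt"
    fi
done
```

This still leaves the non-descriptive names; it only makes the pieces importable.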

2) My current solution:

I like my notes in DevonThink a little more readable than the entries this simple export produces. So I've updated my notes file to include "fields" that can be manipulated via the "awk" command. I use two colons "::" as a field delimiter (since I never type two successive colons in my notes - unless it's a typo!).

The notes are organized like this:

type::location(in my paper notes)::date::subjectline::keywordtags::body::source::#page

Here's a sample entry:

Q::NtBkA140::06-09-10::Bossy on printing and the word in C16 (dx1500c) Europe::fxhistory lxeurope txearlymodern pxerasmus pxmcluhanmarshall kxreformation kxprinting kxmedia::"To exactly which word the Christian community of the C16 was being turned is a question of some delicacy... Erasmus himself is surely sufficient evidence that the 'word' of the C16 was to a large extent the devocalised and desocialised medium whose emergence has been argued for by transatlantic media theorists in the wake of Marshall McLuhan."::bxBossyCW86::98-99

Thanks to those field delimiters, I can rearrange the various fields in any given export using the "awk" command. I also like to strip the tag markers off for export. So I can produce a cleaned-up export that looks like this:
__________________________

Bossy on printing and the word in C16 (1500c) Europe

history europe earlymodern erasmus mcluhanmarshall reformation printing media

"To exactly which word the Christian community of the C16 was being turned is a question of some delicacy... Erasmus himself is surely sufficient evidence that the 'word' of the C16 was to a large extent the devocalised and desocialised medium whose emergence has been argued for by transatlantic media theorists in the wake of Marshall McLuhan."

BossyCW86, 98-99

NtBkA140
06-09-10
________________________
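
An export along these lines could be produced with awk roughly as follows. This is a sketch under assumptions rather than the actual command used: it assumes every tag marker is one lowercase letter followed by "x", that date markers in the subject line appear as "(dx...)", and the file names are hypothetical. The sample record is the entry shown earlier, with the quotation cut short at its ellipsis.

```shell
#!/bin/sh
# Sketch only: sample-notes.txt and export.txt are hypothetical names.
printf '%s\n' 'Q::NtBkA140::06-09-10::Bossy on printing and the word in C16 (dx1500c) Europe::fxhistory lxeurope txearlymodern pxerasmus pxmcluhanmarshall kxreformation kxprinting kxmedia::"To exactly which word the Christian community of the C16 was being turned is a question of some delicacy..."::bxBossyCW86::98-99' > sample-notes.txt

awk -F'::' '
# Drop a two-character marker (a lowercase letter plus "x") from the
# start of each space-separated word.
function strip(s,    n, i, w, out) {
    n = split(s, w, " ")
    out = ""
    for (i = 1; i <= n; i++) {
        sub(/^[a-z]x/, "", w[i])
        out = out (i > 1 ? " " : "") w[i]
    }
    return out
}
{
    subject = $4
    gsub(/\(dx/, "(", subject)   # assumed: date markers look like "(dx...)"
    print subject                # subject line first, so it becomes the title
    print ""
    print strip($5)              # keywords without their markers
    print ""
    print $6                     # body
    print ""
    print strip($7) ", " $8      # source and page
    print ""
    print $2                     # location in the paper notes
    print $3                     # date
    print "________________________"
}' sample-notes.txt > export.txt
```

The "::" delimiter is passed to awk with -F, so the fields can be printed in any order; the type field ($1) is simply left out.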

Although it's a little tedious, I now export all my new note entries (usually once a week) to a single file, open it in TextEdit, and use the "Take Plain Note" service (Cmd-Shift-9) to clip each note to the DevonThink database. If I do it this way, DevonThink will turn the first line of my clipping (in this case the subject line) into the title of the file, which makes everything much easier to read and sort in DevonThink Pro. As long as I do this regularly for new notes, it doesn't take too much time.

 