Jan 9, 2013

Going paperless: an update

It’s been a long road, and a long road remains ahead. Given the power of computers, we were promised the “paperless” office. It never happened. I suspect that, like most people, I have more paper to deal with than ever.

A long while ago, after a major move, I realized I had a lot of paper. After my wife then moved more boxes into our home, I realized we had A LOT of paper. We had to do something. I bought a discounted all-in-one printer/scanner and started on my documents in the basement. Although I’ve never quite got everything done, our basement has a lot more room… to fill up with more paper.

Last month, I looked at the crystal ball and saw what was coming… tax time! We didn’t have any good procedures in place to keep up with our finances this year (as usual). We start well, but when we hit some sort of bottleneck or roadblock we tend to not do a thing until we have to. When tax time rolls around, it’s a month or two of pure panic trying to get things in order. Our routine has been: sort papers, do a bit of entering, put papers away, re-sort papers, enter, put papers sort of away, and around it goes until it becomes a big mess that we never really clean up.

This year, I decided the paperless route is the way to go. Sort papers only generally to scan, then put them away and don’t bother with them until you archive or destroy them. But how to do it? Luckily, a software developer provided a copy of the digital book “Paperless” by MacSparky (aka David Sparks). It’s a great book with lots of videos to explain how to do things. It was certainly a leg up, not only in software and equipment, but also in thinking about how to do things right. That said, his ideas didn’t entirely suit my issues, and I suspect the same will be true for most people, but the book will definitely help you on your way. It’s available as an iBooks book or PDF.

Here are the lessons I’ve learned thus far:

FUD (Fear - Uncertainty - Doubt). FUD is a term often used to describe someone’s negative opinion when it tends to be unwarranted. Here, I use it to describe what will kill your will to do anything like going paperless. If you fear your system, if you are uncertain about how to do things, or if you doubt whether it’s working, you will fail unless you find a way to identify and fix the problems.
Automate, automate, automate! The greatest source of FUD is your brain. Minimize the number of things you have to do and let the computer do the rest. If you have to do everything (scan, OCR, rename, move to a folder, deal with the paper) you’ll go mad and make lots of mistakes. Let the computer take some of that load off of you. You might need to learn computer automation for your system or buy software solutions to help, but it’s worth it.
Capture everywhere! Once my initial scanning is done, the reality is that we won’t keep it up if we don’t make it easy for ourselves to do. In the “Getting Things Done” mindset, there is the concept of “ubiquitous capture”, or the ability to collect needed data almost anywhere. It’s a bit harder to scan something anywhere, but it’s easier than you might think. We have scanners in the office and kitchen, and for elsewhere we have cameras, such as a new iPod Touch. Furthermore, each of the solutions outside the office can automatically (or is it automagically?) get the scans onto the computer and through some initial processing (such as OCR) without me doing much of anything. If you can’t capture in as many places as possible, those holes will often create more FUD.
Get a stamp! I recently bought a self-inking stamp that simply says “Scanned”. This, on the face of it, seems a little excessive. However, not instantly knowing whether or not a document is scanned causes FUD. Believe it or not, I have had documents go through my system in the last month that are very, very old and possibly already scanned. I simply did not know if they were scanned or not, and it is sometimes quicker to scan a document again than to hunt down the earlier scan. Today, if I don’t see a stamp, it’s not scanned.
Trust, but verify. Like many people, I know that computers can seriously mess things up. Don’t ever assume that after you scanned a document, it successfully made it through your system. Make sure that it does. Don’t stamp it until you’re sure!
File Creation Date. If you can set the creation date of a file, set it to the date the original document was created. In fact, duplicate that in the file name (see “Name (and tag) things consistently” below).
Name (and tag) things consistently. Work on a good file naming convention and stick to it. If you use tags (I use the OpenMeta standard on a mac), be as consistent there as possible. Keep in mind, however, that tags and file names are sometimes fragile and can be altered unintentionally, so be aware of potential pitfalls. My convention looks like this: 2013 - 01 - 01 - meijer, groceries, ccMYCARDID.pdf. I start off with the date of the receipt or document (2013 - 01 - 01 -), then the store or source of the document (meijer), then categories (groceries), and finally, if it was a payment, some way to identify a credit card or cash. (The “Paperless” book is the source of at least the date part of this system.) The name can get excessively long, but on most modern operating systems that shouldn’t be a problem (and the files will sort themselves in order by date even if you don’t set the creation date of the file). Believe it or not, I don’t manually set the file name; I use an automated system. When my automation triggers to put a file in its place, it reads the creation date and the list of tags and renames the file accordingly. I don’t have to think about the file name at all. The “Paperless” book shows a few ways to do this automatically using OCR’d text, but his exact system doesn’t quite work for me.
Identify bottlenecks. My main bottleneck is my email (yes, the already paper-free part of the system). I get hundreds of emails every day. In the middle of all that, there are some things I need to put into my system, like bills or receipts. If I don’t deal with them right away, they get lost in the weeds. Worse, once I do find them, I don’t have a way to mark them as “scanned” or “filed”. I need to fix this. I could switch to Google Mail and their web interface (which I don’t like that much), switch email apps, add plugins to my existing app, or set up some complex smart folders. I’m not really happy with any of those, but I’ll have to make a decision soon.
Backup. Well, duh! Oh, wait, I haven’t done it yet! Once the initial scanning is done, though, I’ll be making several copies to stash around to protect the data. Given its importance, it would be bad to lose even a small portion of it.
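
The naming convention from the lesson on consistent names is easy to express in code. What follows is only a minimal Python sketch of that convention, not the automation I actually run: the function name build_file_name and its parameters are made up for illustration, and it assumes the date, source, categories, and payment ID have already been collected from somewhere (tags, for instance).

```python
from datetime import date


def build_file_name(doc_date, source, categories, payment_id=None, extension="pdf"):
    """Return a name like '2013 - 01 - 01 - meijer, groceries, ccMYCARDID.pdf'.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # The date comes first so that files sort chronologically by name.
    date_part = doc_date.strftime("%Y - %m - %d")
    # Source, then categories, then the optional payment identifier.
    parts = [source, *categories]
    if payment_id:
        parts.append(payment_id)
    return f"{date_part} - {', '.join(parts)}.{extension}"


print(build_file_name(date(2013, 1, 1), "meijer", ["groceries"], "ccMYCARDID"))
# prints: 2013 - 01 - 01 - meijer, groceries, ccMYCARDID.pdf
```

Keeping the date at the front of the name is what lets the files sort in date order even when the creation date of the file itself is wrong.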

Software and tools

I’m a mac guy, so much of the below is mac-related software and hardware. However, some of the below will also be useful to other platforms.

1) Data Storage

My document storage device is a Drobo. A Drobo is similar to a RAID, but it is a bit simpler to use. The Drobo is designed to limit the damage of drive failures. Believe me, I’ve had drive failures (at least 3 in the Drobo itself). While it protects against drive failure, you still need to back up.

2) Backup

I use spare hard drives and optical disks (although I’m moving away from optical). For hard drives, I use an external drive dock - I can just plug in a hard drive as if it were a floppy and back up the data. I am considering cloud-based backups, but I have enormous amounts of data and it doesn’t yet seem that cost effective in my case.

3) Backup Storage

I use 2 fireboxes and each is stored in a different location away from my home. Documents and photos are extremely important. No point in tempting fate.

4) Scanners and Cameras

You can use both scanners and cameras to ‘scan’ documents. Scanners are a bit more precise, but a good camera does the job too. Here are the devices that I currently use:

Epson Perfection 1660 Photo - an old, discontinued flatbed scanner I often keep in the basement in storage. I bring it out whenever I need to do a lot of scanning next to my computer.
Epson Workforce 645 All-In-One - It’s a great device which has a document feeder and can do duplexing on certain document sizes. This is a great workhorse device to capture many normal documents.
Doxie Go - This is a small, battery-operated scanner. It doesn’t do duplexing and it doesn’t have a lot of options. This is our “kitchen” scanner. The Doxie’s role is to allow for capturing of important mail and receipts as they come in. Using a standard memory card, I’ve been able to scan between 100 and 170 receipts in one sitting. If I use a wifi-enabled card (which we do while it’s in the kitchen), I get only about 40 to 50 scans before it needs recharging. This isn’t a surprise, since the wifi takes a bit of power to operate. It’s certainly not a workhorse scanner, but not needing a computer at all to operate is handy.
iPod Touch (5th Gen) - The new iPod has a decent-enough camera to “scan” receipts. It’s harder to use on full-size documents, but can still do the job. The advantage of using the iPod as a scanner is that I can use it anywhere (in the house or out) and scan things the moment I get the receipt (in case I lose it!). I don’t recommend older iDevices since the camera isn’t particularly good for this role.
High-megapixel cameras - While we haven’t tried it, we have a couple of cameras of 3 megapixels and up. In theory, these could do the same job as the iPod and “scan” documents. This approach may require mounting the camera to keep it stable. Plus, the same wifi card that is in the Doxie can also be used in the camera.

5) Cloud Services

Dropbox. Dropbox is essentially an internet sharing tool that allows you to share with all of your devices (and with others if you choose). The iPod in particular uses this service. When I “scan” a document with the camera, I send it to a specific folder in Dropbox. After a few minutes of internet and computer magic, a scanned and OCR’d version of the document is ready for tagging and filing. It would, of course, be better if I used an iPhone or an Android phone for this, but I don’t have one.

Eye-fi. I originally didn’t think of this as being a cloud service, but it really is. An eye-fi is an SD memory card with built-in wifi. Once it connects to your network, it will upload all the new images to the eye-fi site. Your files will then make their way down to your computer. The nice thing about this is that we can use the eye-fi for both photography and scanning. However, it mainly lives in our Doxie Go.

6) Software - iOS Based

There’s only one bit of software I’m using in the system on iOS - Jot Not. It’s an app designed for taking pictures of documents, and it then lets you transfer those documents to other services like Dropbox. As a matter of fact, I was waiting for my wife in the car today and asked for any receipts she might have… boom! done.

7) Software - Mac

OpenMeta - OM isn’t really software, it’s an open standard for tagging files on the Mac. It is not supported by Apple and not all applications will respect the tags (i.e. if you do something to the file, you might lose your tags). There are a number of apps that support OpenMeta, some are free. However, since this is my main method for tagging files, I try to use apps that will support, or at least respect, OM.

PDF Pen - PDF Pen is a step up from the standard PDF viewing app on a mac. Most importantly for this system, it will OCR documents (and is AppleScriptable). PDF Pen will also scan files. However, it doesn’t always respect tagging. As a result, I do all the work needed using PDF Pen BEFORE I tag the file.

Yep - I’ve been using this app off and on for years. It is a PDF manager and viewer. It’s particularly useful for organizing documents. You can OpenMeta tag and set the creation date within Yep. Yep also will scan documents. Yep is my choice when I’m scanning non-standard document sizes or shapes.

Hazel - Hazel is one of those apps I’ve tried over the years without ever finding a use for it. However, that’s changed. Hazel is, according to the website, “Automated Organization for your Mac”. Exactly what you need to file documents. It watches different locations on your system to see if anything matches a rule; if it finds a match, it executes that rule. I use this functionality in two ways. First, I use it as a funnel for new documents. Since I capture from a number of sources, they tend to appear on my system in different places. Should a file appear in one of those places, Hazel converts it to PDF (if needed) and sends it to another directory. In that directory, the PDFs are presented to PDF Pen for OCR (using an AppleScript). Once that’s done, Hazel moves the file again to a waiting area. There I manually sort the documents - some need to go into business, work, or personal. It’s at this point that I tag and date each file, but I don’t rename it. Instead, I move it into another folder Hazel watches. Hazel determines whether it knows what to do with the file based on the tags. If it does, it automatically renames the file based on its creation date and tags and moves it into my file archive where it needs to go.
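
For anyone without Hazel, the final filing step described above (decide by tags, rename from the date and tags, move into the archive) can be roughly approximated in Python. This is a sketch under stated assumptions and not a Hazel rule: the ROUTES table, the folder names, and the function file_document are all hypothetical, and the tags and document date are passed in directly rather than read from OpenMeta or the file system.

```python
import shutil
from datetime import date
from pathlib import Path

# Hypothetical mapping from a sorting tag to an archive subfolder.
ROUTES = {"business": "Business", "work": "Work", "personal": "Personal"}


def file_document(pdf_path, created, tags, archive_root):
    """Rename and move one tagged file; return None if no tag matches a route."""
    pdf_path = Path(pdf_path)
    route = next((ROUTES[t] for t in tags if t in ROUTES), None)
    if route is None:
        return None  # Unknown tags: leave the file where it is for manual handling.
    # Tags that only choose the folder are kept out of the file name.
    name_tags = [t for t in tags if t not in ROUTES]
    new_name = f"{created:%Y - %m - %d} - {', '.join(name_tags)}{pdf_path.suffix}"
    target_dir = Path(archive_root) / route
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / new_name
    if target.exists():
        raise FileExistsError(target)  # Never overwrite an archived document.
    shutil.move(str(pdf_path), str(target))
    return target


# Example call (paths are placeholders):
# file_document("waiting/scan0001.pdf", date(2013, 1, 1),
#               ["personal", "meijer", "groceries", "ccMYCARDID"], "archive")
```

Refusing to overwrite an existing file is a deliberate choice here: two receipts from the same store on the same day would otherwise silently replace each other.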

rsync - open source command line app to back stuff up! I may choose another app later, but it’s good enough for now.
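
As an illustration of how an rsync backup might be scripted, here is a small Python wrapper. It is a sketch only: the helper names and the example volume paths are placeholders, not my actual setup. It uses rsync’s standard --archive, --verbose, and --dry-run options, and it defaults to a dry run so nothing is copied until you have checked the output.

```python
import subprocess


def rsync_command(source, destination, dry_run=True):
    """Build an rsync command that copies the contents of source into destination."""
    cmd = ["rsync", "--archive", "--verbose"]
    if dry_run:
        cmd.append("--dry-run")  # Only report what would be copied.
    # A trailing slash on the source copies the folder's contents, not the folder itself.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


def run_backup(source, destination, dry_run=True):
    """Run rsync; raises subprocess.CalledProcessError if rsync reports a failure."""
    subprocess.run(rsync_command(source, destination, dry_run), check=True)


# Example call (placeholder paths):
# run_backup("/Volumes/Drobo/Archive", "/Volumes/Backup/Archive", dry_run=True)
```

Because check=True makes a failed rsync raise an error instead of passing silently, this fits the “trust, but verify” lesson: a backup that didn’t finish should be loud about it.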

Final Words

While my approach is great for me, it certainly isn’t what’s best for everyone. Take a look at the “Paperless” book or some similar book (“Paperless” is cheap and comes with video!) and develop your own system. It might be worth it in the end. Just remember, once you start, you must keep it up if it’s going to be useful. If the system is too hard, it will fail.