Sunday, 16 September 2012

Optical Character Recognition

As a high school student I am subjected at regular intervals to parent teacher interviews. I’m not too concerned about what the teachers say, I’m more interested in what the parents say about me. The thing is though, they won’t tell me. Not outright anyway.

Apparently at a very recent interview my parents voiced their frustration at my continued use of a manual typewriter (namely my beloved Brother 215), and they were very vociferous about my continual cycles of drafts – retyping – second draft – retyping – final copy. It is made especially bad by my continual rants about homework, and how it has no learning benefit, and how if I were dictator I would ban homework, and put the Olivetti L32 and Brother 215 back into production. These rants get very simple replies from my parents: “all that retyping you do makes more homework for you”.

As an obnoxious teenager I announced that I ‘couldn’t write on a computer keyboard’. I raised a metaphorical pinkie and announced that I needed a typewriter to write my homework. It was mostly a load of nonsense, but at the time it sounded good.

Surely there’s a compromise, a way that I can keep using a typewriter and not have to retype everything over and over. The first obvious thing that springs to mind is the USB typewriter, but shortly after that come thoughts of my measly savings account and how it would be in negative territory after I bought one, or even just the kit. Then there’s always outsourcing the typing to India. I just scan it in, pay some minuscule amount of money, and then they send me back a Word document, all typed. Yet, this isn’t very ethical. The next obvious thing is Optical Character Recognition, OCR.

OCR is one of those things that works really well in theory, but can create problems in practice. I’ve been experimenting with it and I’ve found that PDFs are next to useless and create far too many mistakes, while high-res images work quite well, as long as you haven’t overtyped or ‘corrected’ anything on the typewriter. Could I do my entire homework by typewriting it, scanning it in and OCRing it? No, I don’t think so. There’s so much time spent fixing the mistakes that it isn’t really worth it. And I don’t have time to break the text up into little blocks of high-res JPEG. Sadly, I would say that in my obnoxious, computer-shunning state, retyping is the most efficient way to do it. Yet, I may get to the stage where I take out a mortgage and buy a USB typewriter, or have a go at converting one myself. I’m trying to hold out against the computer keyboard for as long as possible.

14-9-2012 -- Brother 215

Because I’m a little bit lazy, what’s above is a typecast from my post about my Brother 215. I was going to write another one, but this serves the purpose. Below is the exact, uncorrected text after it had its characters optically recognized.

It's a funny tning being attached
to only one typewriter when I still
buy yet more. I am attached to the
others, too, but not in the same way
as this one. Sometimes I will force
myself to use another machine, just
to ‘keep it going‘, hut I always
return to this one. Its worn black
key-tops and a barely functioning
baak~space are almost tomforting to
me. Other machines feel like you
have to fight them to use them, this
one you don't.
This hasn't always been my type-
writer of choice. I've only had it
for just under a year. Before this
one there was another Brother portable,
now not functioning, and before it a
uery nice Smihih-Corona Galaxie
Deluxe. Both machines I still have,
but hardly use. will then, there be
a new typewriter on the horizon-
that takes over from.this one? Maybe,
I don't know. It will most likely
he a larger machine, a hit more rugged
than this Brother would he nice.
I'm thinking something Hsmmss, a
5000 perhaps?

As you can see it isn’t too bad and it’s all recognisable, but going through and fixing all the little errors could, indeed, drive me mad.
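One rough way to put a number on ‘not too bad’ is to compare the OCR text with a hand-corrected copy, word by word. The sketch below is only an illustration, not something used for this post: the function name `changed_words` is made up, and the ‘corrected’ lines are my guesses at what was actually typed. It uses Python’s standard `difflib` module.

```python
import difflib


def changed_words(ocr_text, corrected_text):
    """Return (ocr, corrected) pairs for each stretch of words that differs."""
    ocr_words = ocr_text.split()
    fixed_words = corrected_text.split()
    matcher = difflib.SequenceMatcher(None, ocr_words, fixed_words, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Join each differing run back into a readable string.
            pairs.append((" ".join(ocr_words[i1:i2]), " ".join(fixed_words[j1:j2])))
    return pairs


# Two lines from the OCR output, with assumed corrections.
ocr_line = "he a larger machine, a hit more rugged"
fixed_line = "be a larger machine, a bit more rugged"

for wrong, right in changed_words(ocr_line, fixed_line):
    print(f"{wrong!r} should probably be {right!r}")
```

Counting the pairs against the total number of words gives a crude error rate. The catch is that you still need the corrected copy to compare against, which means the fixing has to be done anyway.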


  1. Great post. I admire the stubbornness you have regarding your typewriter.

    I had the opposite reaction here in college. I got several A papers that were under the page minimum with comments like "I haven't seen a typewritten paper in years!" across the top. One professor was afraid to write on it because it seemed "sacrilegious". Then again, these were all writing and literature professors, so they have an appreciation for the written word other educators may not.

    1. My history teacher liked my Industrial Revolution assignment on typewriters, that was typed on a typewriter... There's some people here that will still do a handwritten assignment, and that always raises some eyebrows, and questions: "why could you be bothered?". It makes the typewritten assignment almost 'modern'.
      Isn't it great that teachers are paying more attention to how the assignment was typed rather than what's in it...

    2. Haha, that last line got me. It is a LITTLE different in college but not much. I had gotten some papers back the professor hadn't even read because they weren't in MLA format so I had to redo them.

      I added you to my blogroll. You got a good thing going here.

  2. Insisting on a typewriter is a very fine form of rebellion, if you ask me.

    You do seem to be in a position where you need a USB typewriter. For most people it would be a curiosity, but it would make sense given your needs and desires. The kit doesn't cost a mint, but it does require skill with soldering that I personally don't have; if I get one, I'll have to practice first.

    1. It's the only teenage rebellion that I am currently embarking on, and most people think that I'm mad because of it.
      I can manage a soldering iron. Those are my other interests: electronics, radio sets etc etc, so I've had enough experience with a soldering iron to know that if you touch it it hurts, and I've got a scar to prove it - right near my cuticle on one of my fingers.
      The only concern for me with the kit for a USB typewriter though is which typewriter do I do it to?

    2. One you like to type on, but not your favorite. And preferably something common. So something like the Galaxie you said used to be your favorite maybe?

  3. I found a free online OCR tool; it can recognize text from JPG, PNG, TIFF, BMP and GIF images.