Release Notes V 2.1 May 2016
- Re-wrote the word searcher. It was originally the generic one from my text cataloging subsystem, intended simply to match words rather than detect words in a random byte stream, so it was slow and not always correct for Machine Essays. Re-wrote it to be very specific for this use; it is now much faster and will correctly find longer forms of words (i.e., it will find "intently" and not just stop at "intent"). Significant enough change to bump the minor version number, with much richer output.
- Added a timeout for longer word searching. If some minutes go by without finding, say, an 8 letter word then the machine will re-set word length and go again. Variable time based on your CPU use setting.
- Added auto-saving code to ensure an essay isn't lost in a power outage, which happened to me.
- Increased the size of the characters seen to a really really big number.
- Reduced the maximum CPU utilization. With the new scanner the fans were spinning up because things were happening too fast. Now Low, Medium, and High are 25%, 50%, and 75%, respectively, on a dual core machine.
- Took out random word generation. It didn't make sense for the project in the end.
- Added a "Pause and save" feature when an essay gets longer than 10 meg. A dialog will appear at this point asking you to save.
- Fixed some UI issues. Not style, but functionality.
Release Notes V 2.0.2 Jan 2016
- Added an option to make the minimum word length variable. NOTE: May be extremely slow.
- Words after a period (etc.) are now capitalized
- Spent some time optimizing the word scanner so it's a good bit faster
- Made the self-editing probability much lower
What Is It
This is a weird one, weirder than my other apps, believe it or not. Or, it may be great art.
This started back in 1981 when I bought an Epson MX-80 printer for my Apple ][+. Good printer, fun to play.
I got thinking about the old bromide of "an infinite number of monkeys typing on an infinite number of typewriters for an infinite time will write Shakespeare". Conceptually that made sense to me, but I was interested in even a small proof of the idea. So I wrote a little BASIC program to print random characters to the MX-80. I let it slowly churn until I had two pages worth of printing (slooooowly, this was 1981 remember). Took a yellow highlighter and started marking words I found. I was surprised to find that there were quite a few American English words in those two pages. "Interesting" thought I,"I would like to explore this further."
The press of time, the limited power of the computer available to me, and actual, money-making products to write shelved the idea. Always been simmering, but didn't take any more steps.
For some reason or other the idea resurfaced strongly here in 2013, and coupled with my personal idea that these machines are fast approaching sentience, Machine Essays came to be.
What it does
Simply, Machine Essays asks the computer to generate blocks of random characters. The blocks are scanned for American English words, and if any words are found they are added to the output text block.
No grammatical or parts-of-speech rules are applied. The goal of Machine Essays is not to create automatic natural language output from input data and AI. There are many folks working on that, and doing a great job.
Instead, this is the raw, unfiltered thoughts of your machine. I do "ask" the machine if it wants to apply punctuation after each word it generates. This is the one area where there is a "rule": I bias it based on the length of an average American English sentence.
But maybe not. We'll see.
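Here is a minimal sketch, in Python, of how a punctuation "ask" biased by sentence length might look. The average of 15 words and the name maybe_punctuate are assumptions made for illustration; the real numbers and code inside Machine Essays aren't documented here.

```python
import random

AVG_SENTENCE_WORDS = 15  # assumed value; the app's real number isn't documented


def maybe_punctuate(rng=random):
    """Return "." with probability 1/AVG_SENTENCE_WORDS, otherwise ""."""
    # Asking after every word with this chance gives sentences that
    # average AVG_SENTENCE_WORDS words over a long run.
    return "." if rng.random() < 1.0 / AVG_SENTENCE_WORDS else ""
```

Called once after each word, this leaves the decision to the random draw while keeping sentences from running on forever.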
This is a major revision to Machine Essays. I’ve been thinking a lot about what this does (yes, I have, I think about this way too much) and I added two new generation modes.
The first came from thinking about how I write. I don’t generate a bunch of letters, then pick out English words from them, then write them down. I go through the list of words I’ve learned, pick one, and then write it down.
So the first new Essay generation mode, pick from list, does the same thing. When you configure the Essay, click the ‘Pick from word list’ tab to set that mode and choose the options available.
When you hit ‘Start’, your machine will then randomly pick words from the word list you selected. Basically it’s picking from a vocabulary it already has and outputs the result, similar to what I’m doing right now in my brain.
As you can imagine, this is a much faster process than scanning millions of characters for words, so I’ve throttled it quite severely. At its fastest it will choose one word every 1/3 of a second; at its slowest, one per second.
May 2016 And I don't like it and am planning to take it out. If you do like it let me know. I might not care anyway.
The second, Both, mode runs these two concurrently, with a bias between one or the other that you can set.
Took all this out. Didn't like it.
Another thing I added in version 2 is editing. Your machine can now choose to edit its essay: delete words and phrases, add words and phrases, modify words and phrases. Effectively, this lets your machine go back and correct what it would like.
This is pretty seamless to you, the observer, with exceptions. When your machine chooses to edit by changing or adding words, it of course has to generate the words that it wants to replace or add. Those words are gathered offscreen, then applied when the edited section is ready to be changed. That means your machine may appear to pause without any visible output for a period of time. I put a red blinking "Editing in progress" label in the header of the window to let you know that editing is happening, so just be patient.
BTW, if your machine is speaking when an edit session starts, speech will stop and re-start after the editing session is over.
Version 2.0.1 allows you to turn off machine editing in the Configuration sheet if you'd like.
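For the curious, the three kinds of edit can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name apply_edit, its parameters, and the idea of treating the essay as a list of words are all hypothetical, not how Machine Essays actually stores or edits its text.

```python
def apply_edit(words, op, start, length, new_words):
    """Return an edited copy of words.

    op is "delete", "add", or "modify"; new_words stands in for the words
    gathered offscreen before the edit is applied.
    """
    edited = list(words)  # work on a copy so the visible essay isn't touched mid-edit
    if op == "delete":
        del edited[start:start + length]
    elif op == "add":
        edited[start:start] = new_words
    elif op == "modify":
        edited[start:start + length] = new_words
    else:
        raise ValueError("unknown edit: " + op)
    return edited
```

Because the replacement words have to exist before the edit is applied, the visible pause described above falls out naturally: nothing changes on screen until new_words is complete.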
Machine Essays works this way for generating words from random characters.
A random block of 1,000,000 characters is generated.
The block is filtered for ASCII (oh, UTF8 for you kids) characters. Interestingly, that's about 1/2 of the block. You can see this raw buffer in the Raw Buffer tab.
Then the block is walked for American English words. When it finds one it puts it in the output window. Then punctuation may or may not be added.
When the block has been scanned, another one is asked for.
And on and on it goes.
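The loop above can be sketched in Python roughly like this. It is a simplified illustration, not the app's scanner: the function names, the use of os.urandom for the random block, and the greedy longest-match walk are all assumptions on my part.

```python
import os


def random_block(size=1_000_000):
    """Ask the operating system for a block of random bytes."""
    return os.urandom(size)


def keep_ascii(block):
    """Keep only ASCII bytes (values below 128), lower-cased for matching."""
    return bytes(b for b in block if b < 128).decode("ascii").lower()


def find_words(text, words, min_len=4):
    """Walk text left to right, taking the longest listed word at each spot."""
    max_len = max((len(w) for w in words), default=0)
    found = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so "intently" beats "intent".
        for n in range(min(max_len, len(text) - i), min_len - 1, -1):
            if text[i:i + n] in words:
                match = text[i:i + n]
                break
        if match:
            found.append(match)
            i += len(match)  # continue after the word just found
        else:
            i += 1
    return found
```

With a word list loaded into a set of lower-case words, one pass is find_words(keep_ascii(random_block()), word_set). Since half of all byte values are below 128, the filter keeps about half of each block, which matches what the Raw Buffer tab shows.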
The scanning happens in a background thread whose priority you can set, from about 25% of your computer's processor resources to 75% during the scan. I have Machine Essays running on 3 machines, and it never gets in the way.
The pick from list method is clearly simpler, and you've probably figured it out on your own. It picks a random word from the word list.
Both does both at the same time.
Natural runs the random character scanner, adds words to its vocabulary, then runs a slow pick from list to generate
Making it go
Click Start, and a sheet will drop down to configure the essay session. New options here in version 2.0.
Some options are for any of the three selection methods.
- The first option for Machine Essays is the word list to use to match against. If you want to modify or add word lists, go to the Tools menu, select the "Word List Manager" and the manager window will appear. The details of creating word lists are identical to the stand-alone app Words Inna File, so look there for the docs. I include two word files. One, called EWOL list, is huge, almost 150,000 words, many of them obscure. This list gives your machine the most flexibility in expressing itself, but will also produce the most incomprehensible sentences. The second is a standard ESL list with about 28,000 words. Much more limited output, but probably more comprehensible. You may also notice an occasional word that doesn't appear to be in any list; that's freedom given to the machine to make its own thing up.
- CPU Use tells Machine Essays how much machine power it should use. Low is slower and keeps your machine cool. High will spin the fans of a MacBook Pro. I recommend Low, just let it run. For the "pick from word list" option, this regulates the amount of time between picking words, from about 1 second to about 1/3 of a second.
- Chance of random word gives Machine Essays more chances to make up its own words. At zero it makes up none, and just finds words in the word lists. From 1-10 there is a 1%-10% chance that your machine gets to make up its own word and put it in the output window. Very experimental. I usually have it set to 0% myself; a friend always has it at 10%.
- Self-editing allowed. This is checked by default, and allows your machine to edit its essays. Un-check if you want to see the continuous, raw, untouched output.
Options for picking words from the random buffer
Minimum word length sets, uhhh, the minimum length of the word ME will pick from the buffer. I default this to 4, since a smaller number means a whole lot of 'a', 'it', 'of' and so on, not really interesting. 4 or higher is interesting. 1-3 letter words are managed by the short word manager.
Use file for input checkbox
Lets you look at any file and pick the words from it; I'm not really sure why it's here. Might be interesting if you're searching for ASCII hidden in a picture file, for example (you'd be surprised). Click it, then select a file by clicking the select file button. When the scan starts this file will be opened and loaded, and one pass will be made through it for expressions. Please note that large files with many non-words can take a very long time to process. Once the file has been scanned the process will stop. Note: because of the way I pre-process the text buffer for scanning, some words may not be found and some words may be created by pre-processing. This is NOT a comprehensive list of words in a non-ASCII file, just the ones I can find. Again, not really sure why I put that in here.
Options for word list picking
Only one. Since the word list ME is picking from has (or should have) all the short words that the Short Word Manager uses, this gives you the option to turn off generating short words. I keep it on because it seems to generate more natural sentences.
Options for Both
Again, only one. Lets you choose whether picking from a buffer gets more time, or picking from a list.
Machine Essays has an internal "random short word" function that will, after every word is found, see if the Machine would like to add a short word (a, an, it, ah, and so on) to ensure those exist, but not have too many of them. Early experimentation showed that allowing the machine to get short words from the word list resulted in way too many short words, so I made this change. The machine still has the option to add a short word (it's the machine's randomness that decides), but you're not overwhelmed with short words.
You can see the list, and modify it, by selecting the Short word manager menu item.
You can also change the probability value, but that's currently unused in version 1.0
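A minimal sketch of the short word idea, in Python. The 20% chance, the name with_short_word, and placing the short word after the found word are assumptions for illustration only; the four words are just the examples named above, not the full list from the Short word manager.

```python
import random

SHORT_WORDS = ["a", "an", "it", "ah"]  # example entries only; the real list is editable
SHORT_WORD_CHANCE = 0.2  # assumed value; the app's real probability isn't documented


def with_short_word(word, rng=random):
    """Return [word], or [word, short_word] when the random draw says so."""
    if rng.random() < SHORT_WORD_CHANCE:
        return [word, rng.choice(SHORT_WORDS)]
    return [word]
```

Keeping the short words in their own small list, with their own chance, is what stops them from swamping the output the way they did when they came straight from the big word list.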