Artistry of Neural Nets – Language

This post is part of a fun look at neural nets, exploring the artistic potential of neural net learning algorithms.

In this second neural net post, we explore the artistic world of natural language.

H2O Prediction Engine

H2O is a prediction engine that works very well with R and brings massively scalable big-data analysis capabilities to large data sets.


Setting up h2o and using it is fairly simple:

require(h2o)           # load the H2O package
localH2O = h2o.init()  # start a local H2O instance
h2o.show_progress()
df.hex <- h2o.uploadFile(path = "train.csv", destination_frame = "df.hex", header = TRUE)

A surprising amount of common R functionality works flawlessly.

nrow(df.hex)
summary(df.hex)
names(df.hex)
mean(df.hex$target == '_')  # fraction of rows with each target character
mean(df.hex$target == '?')
mean(df.hex$target == ',')
mean(df.hex$target == '.')
length(unique(df.hex$target))
num.levels = length(h2o.levels(df.hex$target))
# rule-of-thumb hidden-node count (Hobs 2015); bufferLen is the sliding-buffer length N
nodes = nrow(df.hex) / (2 * (num.levels * bufferLen + 1 + num.levels)) / 3
n = names(df.hex)

If a common R function does not work, prefixing it with “h2o.” usually fixes it.

Training is simple; one just lists the column names:

h2o.show_progress()
model <- h2o.deeplearning(x = c("columnName", "etc"),      # columns for predictors
                          y = "target",                    # column for label
                          training_frame = df.hex,         # data in H2O format
                          hidden = c(nodes, nodes, nodes), # three layers of nodes
                          epochs = 300)                    # max. no. of epochs
h2o.saveModel(object = model, path = "sassy.model", force=TRUE)

Prediction is done via the h2o.predict function:

fit = h2o.predict(object = model, newdata = as.h2o(r.df))

Auto-regressive structure of natural language

Insight: Sentences are somewhat auto-regressive.

[Figure “brownfox”: a sentence encoded as a numeric character series]

[Figure “arima-learned”: an ARIMA model’s prediction of the next value in that series]

The above prediction suggests another word could follow.  With a more sophisticated method, perhaps actual words could be generated.

To create a training file, a sliding buffer holds the last N characters, with the (N+1)th character as the target to be learned.

Sliding Buffer mechanics:

[Figures “easy” and “easy2”: sliding-buffer examples]
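The sliding-buffer mechanics can be sketched in Python (the post's pipeline is in R, and names like `make_rows` and `buffer_len` are illustrative, not from the post):

```python
def make_rows(text, buffer_len):
    """Slide a buffer of buffer_len characters over text; the target
    for each row is the character that immediately follows the buffer."""
    rows = []
    for i in range(len(text) - buffer_len):
        window = text[i:i + buffer_len]   # the last N characters
        target = text[i + buffer_len]     # the (N+1)th character to learn
        rows.append((list(window), target))
    return rows

# each character of the window becomes one predictor column
make_rows("the cat sat", 4)[0]  # (['t', 'h', 'e', ' '], 'c')
```

Each window character maps to one predictor column in the training frame, with the target character as the label.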

To simplify the learning:

  • all characters were made lowercase.
  • punctuation was greatly simplified:
    • semicolons become periods.
    • periods and question marks preserved as sentence terminators.
    • all other punctuation stripped.
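This cleanup can be sketched as follows (a minimal Python sketch; the post's actual preprocessing code is not shown, so `simplify` and its regex are assumptions):

```python
import re

def simplify(text):
    text = text.lower()             # all characters lowercase
    text = text.replace(";", ".")   # semicolons become periods
    # keep letters, spaces, and the sentence terminators . and ?
    return re.sub(r"[^a-z .?]", "", text)

simplify("Stop; Look! Really?")  # 'stop. look really?'
```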

In the following, another input value called “spot” is added: the number of characters since the last sentence ending. It was added to help reduce run-on sentences and isn’t strictly needed.
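The “spot” value can be sketched as a counter that resets after each sentence terminator (a hypothetical helper, not the post's code):

```python
def spots(text, terminators=".?"):
    """For each character, the number of characters since the
    last sentence-ending character."""
    out, count = [], 0
    for ch in text:
        out.append(count)
        count = 0 if ch in terminators else count + 1
    return out

spots("ab. cd")  # [0, 1, 2, 0, 1, 2]
```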

[Figure “trainingfile”: sample rows of the resulting training file]

This produces a training file with about half a million rows.

Results

Should versus Ought

The Brothers Grimm stories were fed into H2O’s deeplearning function with an interesting result: occasionally the word “shought” appeared.

In the corpus:

  • should appears 194 times
  • ought appears 18 times

According to the Oxford dictionary, the two have nearly identical definitions:

Ought: Used to indicate duty or correctness, typically when criticizing someone’s actions

Should: Used to indicate obligation, duty, or correctness, typically when criticizing someone’s actions

So in some contexts, the deep learning neural net could not decide between should and ought.

More commonly, however, the results from the Brothers Grimm stories were less than satisfactory:

What are golden apples? asked the wind the work the work, and the wind the work, and the world to who said, and the work the world to when the work, and the world to who said, and the work the world to when the work, and the world to who said, and the work the world to when the work, and the world to who said, and the work the world to when the work, and the world to who said,…

It sounds like someone who has worked too hard in one day!

Nursery Rhymes

More exploration was done with a compiled set of nursery rhymes, chosen for their simple structure. For example:

THE QUEEN of HEARTS
The Queen of Hearts she made some tarts,
All on a summer’s day;

The Knave of Hearts he stole those tarts,
And took them clean away.

The King of Hearts called for those tarts,
And beat the Knave full sore.

The Knave of Hearts brought back those tarts,
And vowed he’d steal no more.

When I was a kid, my friends and I would have added a stanza or two.

Computer Generated Nursery Rhymes

The following generated nursery rhymes were seeded with an initial phrase (the first line of each passage), with the text that follows generated character by character.
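Generation of this kind can be sketched as a loop that predicts one character, appends it, and slides the buffer forward (a Python sketch with a stand-in predictor; the post feeds each window to h2o.predict instead):

```python
def generate(seed, predict_next, buffer_len, n_chars):
    """Repeatedly predict the next character from the last
    buffer_len characters and append it to the text."""
    text = seed
    for _ in range(n_chars):
        window = text[-buffer_len:]   # the sliding buffer
        text += predict_next(window)  # stand-in for the trained model
    return text

# toy predictor for illustration: alternates 'a' and 'b' by window length
toy = lambda w: "ab"[len(w) % 2]
generate("x", toy, buffer_len=4, n_chars=3)  # 'xbab'
```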

Once upon a time there was a dog who lives in a castle.
And the bears an a but to the was an and to buy a little beary better.
An a came to be the bernow the bell,
and a little beary the better bet brides the better the beace a butcher beard there and beard there a little beach but were a but the blard the batter bell be a do the little beach bettle bere to the cat the came to the bear the batter be the water berter.
There beet to bed an a little briddle a canter begs they betthe harter the beater the bring the caltreet a builn,
and the bade a cander a build a but with the bat begain.
The call the came to the callow.
A will to be and the call the cow,
there a cave a little better the was an a little badly the better begaver the betteed the took away,
the little bed away.
The came a little bat,
the call went to the little be the bell,
and a little be the cander and beace and the stom a beaver wenter beater the carry butter them the with a little be the berters and the beard the call and a but when the bill,
and a little beat the pleat
the bridd away
and the bader to the she beater bird.

What about the queen of hearts?
The betteen the better the beather the beday
and the bet beat the still again.

A bure to be the stom a little bell,
and a little bady,
the care a little bear.

And the be beat a little be the bet beat the bady,
and the bere to be the bery
and the better the beat.

The bag the little be the bat the better the berter the betteed the bady,
and all the came to the crown and a will be the bell.
And the better the betteed the better the brides a cat beat to be a beger to be there a crown a crown,
and a little beads a little be a to be the better bet beat at a little bell,
and a little bell,
and a little bell again.

A little a came beave a pay a but and the came to the came to beard and the bader bridenter betther better better better better better the bread a little better beaver the call that a batter bear to belowing a little beas a could the batter bead away
the butter to the ladder bell better betteet to berther beave.

A baby and beat to black and to be bere be were a little bady,
and the came to be the bered the little bell the came the little bear a cried a can he with a little beat to the was a black and the call the beater beather the will to the bather bettenot the bern,
and the bird the call that a graw.

Looking at the above text, one notices a little bell mentioned:

and a little beads a little be a to be the better bet beat at a little bell,
and a little bell,
and a little bell again.

Interestingly, “little bell” never appears in the corpus: “bell” is mentioned 87 times and “little” 803 times.

Once upon a time there was a dog who lives in a castle.
When the little man and the little been the wall a put a gan and the little girl to the cat to married a little man
and a little man,
that boy dow the sout and wate a four and sonny they was a shoot another to the little tone,
and they went whis would wall,
and they been the lover to the lander wilk.
The many the little to the cander a little sight lent to the little giver a little pig a little boy lives and the little pig a would not the was she was a little boy a little goos a cand a crow to the little boys are to the canting a could to will to the little ginger and who see the sailing and the little boy dour a back and the came
and a breat and the play
and when a cant a came
andon a long,
that a little bonn.
Then and the lade and they stoll
and the little mant all the man,
and the little been and a little to merried a little sigin.
The morning to stock.
There and the breat to there a crow to make a gole and the came and when the hand a crown
the she was a love the she cant and there a charter and they the ladded and they to the cant and then all the little give thee to to the stole she was a came a beees a litter and a little boys and when where a crown.

The following two “stories”, while seeded with two different phrases, both began with the phrase “The little man”. The two stories originated from two different deep learning models. The phrase “little man” appears 15 times in the corpus.

Do you see?
The little man,
that be the to me pretty blow and the little begory and the stoom the lady,
that the would with the little gine a gaid to merried and was a loves the would she shoot of a spone the lover a brown,
and the little to the came and the little bee a little boy the was an the london and the soul and the little tombleppy the cat,
that bearty began and the came
and the say a little boys a for the was a little bone,
and the man the pretty can will a little boys stick.
The mander and the store and the poor the little ging a stick and the cant a preat at a little boy breake and they she was an the water and who she piesed a preatery heat a little bone,
and the little pig a to the polly pight
and a little boy preat and the land a preater and which a stain,
and they song a stick.
And then wall were the little boy the cats and the shoot of a spone to the crown and when the stown a came hop.
There all the little boy the little boy the shall and the she was a little boy the water
and the came
and the little beg a little boy little boy the water and the sever and the preat a little pig,
to the came and sone a find,
and they there a stool a pretty breat and who said they the little bone,
that woll with a gone a beees and a wind a can tour the can the house a can to the little tome to the cat.

hickory dickory dock, the mouse runs up the clock.
The little man,
the land to the water and the shoe,
when the little goll,
and the little been and they the should they the house that was a little pig a stick and sailing and the wall a cant a beg ard the little ging
and a man a little boy little marry a little sing and when a canterrow the starle and a good there a can and the shoes.
They was a lady,
the lasters the little man a long,
and was songer and with a gan to the came
and the shoe to the lover the said the little pig.
The came and the little girl to merry she sailes and when all and there a crow that he sailing was a little man,
and when and they was a lady,
and when the little girls still
and the they to the little came
and a little boys a cander and they the land a go the said the came a breaves a little boys and was a little beg a little good a crown.
The little bont there a can and the little been the walled a little boys the water with the water and the little give and the water and the came a crow a pretty benes all and sever a living a spoor the stook,
the said a little came and when with the preaty the can the land a mather away an i she wall the bought and the bring a littout a could the man the living a whis a canted a litting a wonder a crown all the cant
they hould whit worring and the bring and which
the wond,
that can a littie back a and the litting the can the came a whill the little back and who sonny cay anny there a canton a little bone,
and who hore awas a go the four

your conner.

The starting phrase “Do you see?” was used once again, but this time with the newest neural net; notice that the generated story is far different. It is suspected that the initial conditions (initialization) of the deep neural net cause these differences.

Do you see?

At
and a good a down,
and they been a longother an a little boy she wall to the little give to the creather and was she wat a little boy breat to to longing and a gain a stone to mother and the son the came and when a lough a little boy a for that a little the to the was a little girler a thine and the pretty mandy little gind a could sheary had was a spores and a polly little boys
that can a shoot longal to the lives are a should they little the little bone,
and the little boy
and a little thought and a can and a littone the came
and a litton the stoen.
The land they said the little bone,
and the manden and the ling,
to will the little gong,
and the little girl,
the sever a little boy the mant and the good a cant and the little boy the catsrow to the came
and the light home
and a little man,
and the pretty beet and the cample to the man the came a can and the could them wond,
and there a can the can to the brown and who sonny has allul the little beet an a little boy browner a poll.

Random Seeds

Interestingly, when a random seed is given, the neural net completes the “word” and begins telling a “story”.

ifpdzinqsitwxnsaepkmvikvzyvceofudgrpmorbed a little house and the cound the shoe the came and they was shole to beather a stock and the little mistle beg
and a land,

rshypfpzrevhszwdiboiptpqlxhfrhnmcgomqtmckin the chill the carried a crotion

and the little pig a little gone.

Future Exploration

Surprisingly, the trained neural net is capable of generating unique and, for the most part, correctly spelled text. It is frankly amazing that it worked as well as it did.

Avenues for future exploration are:

  • Not simplifying the training set by stripping less frequent characters.
  • LSTM Neural Nets – an exciting approach.

References

Crane, Gilbert, Tenniel, Weir, and Zwecker (1877) “Mother Goose’s Nursery Rhymes” https://www.gutenberg.org/ebooks/39784

Hobs (2015) http://stats.stackexchange.com/questions/181/how-to-choose-the-number-of-hidden-layers-and-nodes-in-a-feedforward-neural-netw/136542#136542

J. O. Halliwell-Phillipps. “The Nursery Rhymes of England”

Oxford University Press (2016) Oxford Dictionaries http://www.oxforddictionaries.com/

Project Gutenberg. “Young Canada’s Nursery Rhymes by Various” http://www.gutenberg.org/ebooks/4921

Walter Jerrold. “The Big Book of Nursery Rhymes” http://onlinebooks.library.upenn.edu/webbin/gutbook/lookup?num=38562

 
