What Can Computers Teach Us About Poetry?

The idea that analysing poetry with computers could teach us anything about the art is controversial. A recent survey I conducted of more than 300 tech-savvy poets confirmed that, while they generally agree that technology has been good for poetry in terms of fostering community, creating networking opportunities, and providing remote learning, they would rather computer scientists keep the ones and zeroes away from their iambs and spondees.

Intuitively, this makes sense — after all, we write poems for people, not machines. Poetry is one of the most intimately human of activities. Yet analytical methods, properly interpreted, can reveal new aspects of poetry that we readers and writers might miss. Blind spots can be corrected, what we sense intuitively can be confirmed scientifically, and computers may indeed help us to see old words with new eyes.

Analysis of aesthetic matters must be conducted on aesthetic terms. For this reason, many of the recent computer analyses of poems have relied on data from psycholinguistic research, wherein subjects were asked to classify different words across a range of dimensions such as the concreteness of imagery evoked, the emotionality (positive or negative) elicited by the word, and how easy or hard a word might be to define.

In 2000, Richard Forsyth used statistical methods to analyse differences between some of the more successful (i.e. published) and obscure (i.e. unpublished) poems from well-known poets writing in English. Subsequently, in 2012, Justine Kao and Dan Jurafsky conducted a statistical analysis of “professional” (published in a reputable anthology) versus “amateur” (available on an amateur poetry website) poems.

Here are tables showing some of the statistically-significant findings from those two studies:

Forsyth (2000)

Successful                            | Obscure
Fewer Syllables per Word              | More Syllables per Word
Fewer Letters per Word                | More Letters per Word
More Common Words                     | More Rare Words
Less Diversity of Word Choice         | Greater Diversity of Word Choice

Kao and Jurafsky (2012)

Professional                          | Amateur
More Concrete Words                   | More Abstract Words
More Likely to Use Approximate Rhymes | More Likely to Use Perfect Rhymes
Less Alliteration                     | More Alliteration
Fewer Highly-Emotional Words          | More Highly-Emotional Words

These findings make sense in light of contemporary trends in poetry since the advent of free verse. Now we have a statistical quantification of trends that literary critics have understood qualitatively for some time.
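To make two of the measures in the first table concrete, here is a minimal Python sketch of how letters per word and diversity of word choice might be computed for a single poem. It is not the method used in either study: the function names, the tokenising rule, and the use of a type-token ratio (distinct words divided by total words) as a stand-in for "diversity of word choice" are all my own assumptions.

```python
import re

def tokenize(text):
    """Lowercase the text and return its words (internal apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def surface_features(text):
    """Return mean letters per word and type-token ratio for one poem.

    Both measures are illustrative assumptions, not the studies' own definitions.
    """
    words = tokenize(text)
    if not words:
        return {"letters_per_word": 0.0, "type_token_ratio": 0.0}
    letters = sum(len(w.replace("'", "")) for w in words)
    return {
        "letters_per_word": letters / len(words),
        # Distinct words divided by total words: lower means more repetition.
        "type_token_ratio": len(set(words)) / len(words),
    }

# A made-up line, used only to exercise the function.
sample = "The sea is the sea and the sky is the sky"
features = surface_features(sample)
```

A poem that leans on a small stock of short, common words would score low on both measures, which is the pattern the first table associates with the "successful" poems.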

In 2013, Michael Dalvean extended the work of Kao and Jurafsky using machine learning instead of statistical methods, analysing the same base of “amateur” and “professional” poems. The approach of classifying types of documents based on what computers can be “taught” about their various properties is what has given us useful and effective spam filters. So, why not classify poems?

The findings from this more sophisticated analysis match the previous study in some ways: the “professional” poems can be identified by a computer looking for concrete words with low emotionality, just as the earlier study found that “professional” poems showed these traits to a statistically significant degree.

Here the findings diverge, however. The computer’s attempt at “learning” what an amateur poem is (or is not) also found ways to use articles (such as “the” and “a”) as part of the basis for its decision making. That is certainly not an obvious approach to a human critic. Combining several of these rules together, the computer achieved an 80% success rate in classifying poems correctly.

However, I believe it is a mistake to argue, as Dalvean does, that a successful computerised method of classifying “amateur” and “professional” poems into either one or the other could be extended to create a spectrum between the two, and thereby extrapolated out to automatically “rank” poetry in terms of its objective quality.

As a practicing poet, I know that beyond some of the “mistakes” that amateurs make in approaching poetry (such as using flowery, emotive, and abstract language), rating the quality of serious poets’ work is a highly subjective matter. Even some of our greatest critics cannot agree (though I suspect that any one of them could surely spot an amateur poem with at least an 80% degree of accuracy). So, while sifting out truly “amateur” poems may be something we can teach a computer to do, we must take with a grain of salt the idea that machines will be “grading” our poems in a way that would match some kind of human consensus about quality anytime soon.

That said, by approaching computerised analysis of poetry for the sake of informing literary criticism, we may indeed make new discoveries. I recently conducted a simple analysis of more than 3,000 poems published in Poetry Magazine — arguably the most reputable American poetry magazine of our time. I used a computer to count and analyse the frequency of words. Comparing this to a similar analysis of poems accepted in the tenure of the most recent editor, Don Share, I was surprised to find that the same “kinds” of words showed up most frequently both in the historical and present-day analysis.
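For readers curious about what "counting and analysing the frequency of words" can look like in practice, here is a minimal Python sketch. It is an illustration under stated assumptions rather than the script behind the analysis: the tiny stop-word list, the tokenising rule, and the three sample lines are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "it", "i", "my"}

def content_words(poem):
    """Lowercase a poem, split it into words, and drop the stop words."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", poem.lower())
    return [w for w in words if w not in STOP_WORDS]

def top_words(poems, n=25):
    """Count every remaining word across all poems and return the n most common."""
    counts = Counter()
    for poem in poems:
        counts.update(content_words(poem))
    return counts.most_common(n)

# Made-up lines standing in for a real corpus of poems.
poems = [
    "The light on the water is the light of day",
    "Night air, and the water dark",
    "My hands hold the light",
]
ranking = top_words(poems, n=3)
```

The result is a list of (word, count) pairs, the same shape as the ranked lists that appear later in these posts.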

Some on social media incorrectly took this as a criticism, since these “poetry words” do seem, in isolation, like they are trying rather too hard to be poetic. In context, though, it was interesting to discover that such words only appear 11% of the time on average in any one poem. There is also a massive “long tail” (42%) of words that only get used once. This is interesting in light of Forsyth’s findings about the low overall diversity of word choice in successful poems. As a consequence, we might consider that these “poetry words” are like salt — the right amount enhances the meal, whereas too much almost certainly can ruin it.

It may be that we poets have always been chasing after some element of the sublime (represented by these “poetry words”), but succeeding in doing so in contemporary poetry only to the extent that we can reinvent these timeless themes (aided, in part, by the “long tail” of non-repeating words). However, this is not something we would necessarily conclude on our own as readers of poetry.

This is because computers counting words are not reading for meaning. They highlight the similarities where a human reader sees the differences. Could it be that such non-human methods can therefore give insights into decidedly human phenomena, such as individual and collective subconscious preoccupations?

For now, one thing is clear: there is much still to be gained by applying analytical methods, aided by computers, to poetry. The key to supporting the poetry community with such findings is to interpret the results to the benefit of literary criticism, using machines to help us peek under the rocks that we might not otherwise inspect.

As computers become increasingly a part of the fabric of our lives, it can be tempting to try to keep them separate from the timeless tradition of language arts. Yet by embracing what is best about computing in service to human experience, we have much to gain. A new breed of cyborg literary critics (human at the core, but enhanced by technology) may be able to tell us more about poetry now than ever.

Call me “RoboPoet”.


Unconscious Preoccupations, Machine Revelations

Turnabout is fair play. Having analysed several thousand poems from Poetry magazine, I have decided to turn the same methodology on myself.

I analysed 5,751 words from the 79 poems from my current pamphlet The Silence Teacher and my forthcoming collection The Knowledge.

Here are my top twenty-five most commonly-used words:

  1. air (27)
  2. light (26)
  3. eyes (23)
  4. day (21)
  5. water (20)
  6. night (20)
  7. face (19)
  8. man (17)
  9. hands (17)
  10. hand (16)
  11. life (15)
  12. place (15)
  13. head (14)
  14. small (14)
  15. world (14)
  16. sound (13)
  17. hold (12)
  18. fingers (12)
  19. love (12)
  20. long (12)
  21. late (12)
  22. white (12)
  23. blue (12)
  24. dark (11)
  25. call (11)

Ouch. Far from the nuanced poet I aspire to be, this reads like I missed my calling writing second-rate Raymond Chandler pastiche, romance novels, or a truly bizarre hybrid of the two. But again, the frequently-used words are actually used relatively infrequently in each individual poem.

In my case, words from the top 100 make up 17% of any one poem on average, whereas a once-again-considerable 43% of the words in each poem are never repeated in another poem. To put it another way, my average poem is 72 words long, with 12 words from the top 100 and 31 words that are never repeated in any other poem in either collection.
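The two percentages follow from the averages quoted: 12 of 72 words is roughly 17%, and 31 of 72 is roughly 43%. As a rough sketch of how such per-poem shares might be computed, the Python below takes poems that have already been split into words (with stop words removed) and reports both figures for each poem. The function name and the reading of "never repeated in another poem" as "appears in no other poem" are my assumptions, and the sample data is invented.

```python
from collections import Counter

def poem_shares(poems, top_n=100):
    """For each poem (a list of words), return a pair:
    (share of its words in the corpus top_n, share of its words found in no other poem).
    """
    totals = Counter(w for poem in poems for w in poem)
    top = {w for w, _ in totals.most_common(top_n)}
    # Number of distinct poems each word appears in.
    poem_counts = Counter(w for poem in poems for w in set(poem))
    shares = []
    for poem in poems:
        if not poem:
            shares.append((0.0, 0.0))
            continue
        in_top = sum(w in top for w in poem)
        unshared = sum(poem_counts[w] == 1 for w in poem)
        shares.append((in_top / len(poem), unshared / len(poem)))
    return shares

# Invented word lists standing in for tokenised poems.
poems = [
    ["light", "water", "light", "day"],
    ["night", "air", "water", "dark"],
    ["hands", "hold", "light"],
]
shares = poem_shares(poems, top_n=2)
```

Averaging the first element of each pair across a collection would give a figure comparable to the 17% above, and averaging the second a figure comparable to the 43%.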

Interestingly, my individual top twenty-five list isn’t an exact match of the top twenty-five for the other poems I analysed. For example, the number-one word across more than 3,000 high-quality Poetry magazine poems, “time”, is nowhere in my own top twenty-five. I suspect, however, that if I were to analyse poems on a poet-by-poet basis from these 3,000, the individual preoccupations and concerns would start to tease out, and many poets in Poetry would stray far from the norm as well.

Furthermore, the specific concerns of my two books are very different from one another, as illustrated in the following word clouds:

[Word cloud: The Knowledge]

[Word cloud: The Silence Teacher]

So, individually, we’re all very different, from poet to poet and book to book, each with our own unique preoccupations. Yet collectively, when pooled, these preoccupations seem to converge on specific words.

What fascinates me about this is that purely analytical methods can reveal aspects of poetry that we readers and writers consciously miss. Computers counting words are not reading for meaning. They are able to highlight the similarities where a human reader sees only the differences.

Could it be that such non-human methods can therefore give insights into decidedly human phenomena, such as the individual and collective subconscious?

It would be interesting to analyse a significantly larger base of poetry texts, to see if they continue to converge on certain words.

For now, one thing is clear — these “Apollonian” words (as Dave Bonta so rightly identified them to be) make a frequent occurrence in all kinds of poetry, including fresh and lively contemporary poems.

Perhaps we poets have always been chasing after some element of the sublime (the top one hundred), but succeed in doing so in the postmodern age only to the extent that we can reinvent these timeless themes (aided, in part, by the non-repeating words). Emerson tells us that poetry, “must be as new as foam, and as old as the rock.” Perhaps the right mix of foam-words and rock-words is part of that endeavour.

Humans write poems for other humans, not for machines. Yet the mechanical analysis of text, correctly interpreted and contextualised, may indeed help us to see old words with new eyes.


No Such Thing as Bad Words

“The Difference Between Medicine and Poison is in the Dose”

-Circa Survive (song title)

In response to my recent analysis of the frequency of words used in past issues of Poetry magazine, current editor Don Share issued me a good-humoured challenge:

So, I analysed 395 poems from 13 issues of Poetry edited by Don Share from October 2013 to November 2014.

I was at first surprised to discover that the results are not substantially different from those for the nearly 3,000 poems from past issues.

The average poem is 92 words in length (again, once stop words have been excluded), containing 14% of words in the top-100 and 24% of words that were only used once across the 395 poems analysed.

Here are the top 25 words:

  1. time (137)
  2. light (104)
  3. night (94)
  4. long (93)
  5. love (93)
  6. man (92)
  7. eyes (92)
  8. white (89)
  9. world (87)
  10. face (83)
  11. air (82)
  12. left (81)
  13. black (79)
  14. water (78)
  15. head (76)
  16. life (75)
  17. day (71)
  18. hand (69)
  19. people (69)
  20. wind (68)
  21. inside (65)
  22. sea (64)
  23. red (62)
  24. things (61)
  25. lost (60)

I found this surprising because Don is, by reputation and in my experience, one of the most interesting and innovative editors around. He’s undeniably on the pulse of contemporary poetry. So why do these words seem like they come from the poems of a century ago?

I think the answer is pretty simple: there are no bad words in poetry, only the overuse of “poetry words” in any single poem. No single poem analysed used even a fraction of the top twenty-five, and I know that on average the majority of words (80-90%) in most poems were not from these top words. Furthermore, a substantial percentage of words showed up only once across all poems, which demonstrates a high degree of linguistic innovation.

Cumulatively, though, these words do keep showing up in poetry (and in Poetry). What is equally interesting to me is the idea that a certain number — in fact, just the right number — of these words may be sometimes necessary to make a poem what it is. These words are like salt — a little bit seasons things, but too much can ruin the dish.

Frequently-used words are used frequently for a reason. These words are terse, expressive, and acquired early in life. They hold a power that, if overused, derails our trust in the author, and defuses our interest in the poem.

Yet they also seem to be some of the great workhorses of our language. So, to me the moral here is: don’t be afraid to use them; but don’t wear these poor creatures out.

Good poems make use of the range of our language the way good painters make use of the range of their palette. To scoff at a composer choosing C-major or a painter choosing pure red is to miss the essentials of technique, context, and intention.

For this reason, to me, there are no bad words, only words used badly.

(Click here to read an analysis of my own poetry using the same methodology.)


Top “Poetry Words”

Having counted the occurrence of words in nearly 3,000 poems published in Poetry Magazine to create a parameterised random word generator, I am making some other interesting discoveries about these words.
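The generator itself is not described in detail here, so the following Python is only a guess at the general idea: pick random words whose corpus counts fall within a chosen frequency band. The function names and the band parameters are hypothetical; the five counts are taken from the top-25 list in this post.

```python
import random
from collections import Counter

def words_in_band(counts, min_count, max_count):
    """Return, in alphabetical order, the words whose counts fall inside the band."""
    return sorted(w for w, c in counts.items() if min_count <= c <= max_count)

def random_words(counts, min_count, max_count, k=5, seed=None):
    """Pick up to k distinct random words from the chosen frequency band."""
    band = words_in_band(counts, min_count, max_count)
    rng = random.Random(seed)  # a fixed seed makes the choice repeatable
    return rng.sample(band, min(k, len(band)))

# A handful of counts from the top-25 list, standing in for the full word count.
counts = Counter({"time": 944, "love": 831, "day": 763, "sky": 436, "wind": 432})

common = words_in_band(counts, 700, 1000)
picked = random_words(counts, 400, 500, k=2, seed=1)
```

Moving the band up or down the frequency range is one way a single "frequency of occurrence" setting could surface very different clusters of words.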

First, as one Twitter user pointed out, the words that come up at each “frequency of occurrence” setting on the generator have their own distinct feel, as if very different types of poets might gravitate toward different clusters of words:

I also created a word cloud using Wordle of the top 100 most-used words, which reveals the nature of these words:

[Word cloud: Poetry Words]

They are all words of one or two syllables, the likes of which you might find in high concentration in my early angst-ridden adolescent poetry journals.

What is interesting, though, is that these words do not appear in high concentration. Of the more than 300,000 instances of words in these poems (the average being just over 100 words in total per poem), these words occur just 11% of the time.

So, the “average” Poetry magazine poem (though, in truth, there may be no such thing) is 106 words long, and incorporates 11 of these top “poetry words” per poem.

Here is a list of the top 25 “poetry words”, with their word counts:

  1. time (944)
  2. love (831)
  3. day (763)
  4. light (732)
  5. night (725)
  6. man (710)
  7. world (696)
  8. long (677)
  9. eyes (631)
  10. life (624)
  11. water (527)
  12. hand (509)
  13. white (506)
  14. air (495)
  15. body (495)
  16. dark (486)
  17. face (477)
  18. dead (463)
  19. heart (451)
  20. years (443)
  21. left (443)
  22. god (439) [both capitalisations combined]
  23. sky (436)
  24. sun (432)
  25. wind (432)

Note that while “man” is sixth, “woman” is 59th on the list. “White” comes in at #13, “black” at #26. And the poets’ all-time top obsession is, of course, “time” (and then “love”, in that order).

I also classified these words by type using the WordNet database. Nearly all of the words are nouns or verbs, with only a single modifier showing up in the top-100 list. That word is “hard” (at #93 on the list). I guess your writing teacher was right to suggest that you avoid piling on modifiers for emphasis.

That said, the infrequently-used words are also considerable in number. In fact, words that occur only once in the nearly 3,000 poems analysed make up 42% of all the words used. I wonder if high-quality prose could boast an equally sizeable “long tail” of unique words. Clearly, part of innovation in language involves vocabulary.

As Emerson once said, “Every word was once a poem.” These days, we have a lot to choose from.

(Click here to read the follow-up, with more analysis of “poetry words” and their implications.)


Revolutionising Poetry with Technology (Survey Results)

First and foremost, thanks to the more than 300 people who took a minute or two out of their busy lives to respond to my brief survey. Clearly people want to record their opinions, and hear what others think, about poetry and technology.

You can see the general report of survey results here. I have also charted and analysed this information below, with some interesting conclusions.

Intention and Methods

First, I should say that the intention of this survey was not to get a broad picture of general attitudes toward poetry, but to focus on specific aspects in a specific group. For a good general analysis, I recommend the Poetry Foundation’s Poetry in America study.

Now, a brief word about my methods. I posted the survey to my website and my social media networks, where it was generously shared by a wide range of established and up-and-coming poets. I also posted this survey to two prominent amateur writer websites, where the focus is on community critique.


Demonstrating Faith in Humanity

What a day it has been. I woke up to the news that my beloved spiritual teacher and friend since childhood, John-Roger, passed away in the early hours at the age of eighty. If there is one thing he taught me, it is to keep doing good, no matter what.

Tonight my sister-in-law and our much-loved little nephew are boarding a plane back to Australia. For whatever I may have been able to impart to him in our two weeks together, he has certainly taught me much more.

In a short while, I will be carrying on with some of the good work I have found to do in the circumstances of my current life, by helping to produce a free, live online poetry broadcast. The show, after all, must go on. It is my way of reaffirming that the world is a small place, and that you and I are not so different after all.

I submitted the following article to Huffington Post Books yesterday, and it has come back to me today with all of these new resonances.

How Bedtime Stories Restored My Faith in Humanity

I never thought a slim paperback of children’s poems, packed with silly illustrations, sing-song rhymes, and bottom humour would restore my faith that printed books will endure. I had rather hoped for the seminal work of some brilliant, tortured Nobel laureate. But those precious few evening moments, while my nephew squirmed beside me in his bed, protesting against obvious sleepiness, confirmed that ours was a shared experience no touch-screen device would soon encroach upon.

Don’t get me wrong  —  he loves phones and pods and pads of every sort and, like me as a boy, becomes easily engrossed in the challenge of video games. The sense of individual progress, developing skill, and the spectacular multimedia rewards at the end of each level of “accomplishment” are tough for paper and ink to compete with by day. Yet when it comes time to switch gears from wakefulness to dreaming, the last thing he needs or even wants is a glowing glass slate crackling with sensory input.

Instead, we share stories and rhymes about creatures who slither and fart. We laugh. He points at the illustrations. As soon as the poem chimes to an end, he asks for another. I begin to read more slowly.

We inhabit the sound of my voice together, a conduit between our two private experiences of the tale being told. As we draw further into ourselves, and into the music of language, we draw closer together. His breathing slows as he slips away fully into his own world, and I creep away, book in hand.

It could only really happen with a book  —  that portable, flimsy, shock-proof, battery-less, recyclable, spill-resistant, organic launch pad into ourselves. In fact, the more his generation inhabits the realm of flickering data on glowing blue screens, the more necessary the interior experience of a good book may become. Studies have shown that such screens promote a kind of restless insomnia, and even passively-lit pads like the Kindle still click my brain into the skim-and-scan gear I whizz through online. So, when it is time to stop surfing for sensory input, and reconnect with myself, I want paper and ink.

Books bring us back to our own imagination (after all, how many times has the movie of your favourite book disappointed you?), to the innermost experience of a tale being told, and to the music of the spoken word. The love of a good book is conveyed first and foremost as an act of love. And really, who doesn’t still love to be read to, at any age?

Traditions endure and outlast technological “disruption” when they tap into what makes us essentially human. There is nothing quite like reading a bedtime story from a printed book. For this reason alone, I have hope that the next generation, for all the amazing discoveries they will make through high technology, will still share some of their most intimate moments, and profound personal revelations, curled up with an old-fashioned book.

Thoughts? Remarks? Visit the article on Huffington Post.


The Paradox of Contemporary Poetry (Board-Game Edition)

There is a great paradox in contemporary poetry.

On the one hand, poetry seems to be dwindling, on bookstore shelves and in traditional academic curricula, so much so that it has become fashionable for journalists to frequently declare it dead. On the other hand, I have but to scroll through my social media feeds to witness an eruption of poetry being written and published online.

Likewise, an offering like Al Filreis’ Modern Poetry online course has attracted more than 100,000 students eager to read and learn from great poets of the past. Furthermore, as a poet I know that even though the overall fan base for poetry may have dwindled since the advent of the Internet, that same technology allows me to connect with global audiences many times the size of what some of our most respected poets enjoyed as regional audiences one hundred years ago.

So it would seem that poetry is dying in the real world, only to be reborn into a kind of “Invisible Golden Age” online.

My own response to this paradox is equally dualistic. I acknowledge that poetry may never go mainstream in my lifetime, and aspire primarily for the respect of respectable peers. Yet at the same time, I work hard to bring poetry to new audiences, in person and online. In that vein, I have been gathering my thoughts about the fact that so many people are now reading and writing poems, yet poetry is still perceived as a floundering art. Really, how can this be?

The following diagram illustrates how I see people engaging with poetry today.

[Diagram: The Poetry Board Game]
  • At the bottom left we have the non-participants, who read and write little. Often, somewhere in the course of their primary education, usually from a teacher they disliked or who disliked them, they got the message that poetry was difficult, irrelevant, or both.
  • At bottom right, we have the self-expressionists, who write much but read little. Many of us entered into this phase in adolescence, when what Wordsworth called the “spontaneous overflow of powerful feelings” turned into our first attempts at poetry.
  • At top left, we have fans of poetry. Here we must distinguish between those who, like the students in Al Filreis’ class, are reading historical poetry, and those who read living authors as well. [Note: a student in one of Al’s classes recently pointed out that they do read some living authors as well.]
  • In the upper right, we have living poets. Reading and writing are the in- and out-breath of a life steeped in poetry, and the most prolific poets I know are also among the most voracious readers.

The boxes in blue represent the behaviours most likely to help usher us out of this “Invisible Golden Age” into, well, a visible one — that is, reading contemporary poetry as a fan and both reading and writing it as a poet.

It is pretty easy to see why the health and longevity of the art depends on these things happening. So how do we encourage such behaviour?

Think of the diagram as a board game. Your mission, should you choose to accept it, is to help usher people from the grey areas into the white, and from the white areas into the blue. Fostering some appreciation of historical poetry, as well as providing some early creative outlet for trying one’s hand at writing the stuff, is usually best begun in primary school. Initiatives like California Poets in the Schools do a fine job of this. They move people into the white.

From here, the sheer volume of poetry being written, and the speed at which it races around online and even in print, can be daunting for new readers. What poets who write much and read little really need are mentors — poets who can read what they are writing and say, “Here, try this established contemporary poet. You might learn something from them about the kind of poem you are trying to write.” MFA programmes are one place where this happens, but workshop groups and tame poet-friends can do this too.

Likewise, readers of historical poetry need encouragement, based on their current tastes, to branch into contemporary poets. Like John Donne? Try Christian Wiman. John Keats? Try Li-Young Lee. For me, this started at university, but it is really never too early or too late to try contemporary poetry.

We may not be able to hit a “home run” by ushering people straight from the grey zone of non-participation into becoming overnight poets in the blue. Yet by first opening the doors to reading and writing poetry of any kind, then by acknowledging that contemporary poetry is largely a matter of taste, and trying to accommodate the tastes of newcomers with useful recommendations, we may well do our part to break contemporary poetry free of its current double-bind.

There is all kinds of evidence for the benefits of engaging more deeply with poetry — psychologically and even physiologically. Like every other contemporary poet, I know this to be true from my own experience. If, like me, you have been looking for ways to help others to find their way to poetry, I encourage you to have a look at the board, roll the dice, and join me in playing the game.

Thoughts? Comments? Join the conversation at Huffington Post.