

Unsatisfying Encounters with Academics of Renown, Episode 2: Noam Chomsky — 5 Comments

  1. Chomsky has done more to retard the field of linguistics than any other educator. It’s really sad how he continues to actively ignore evidence that his theory is wrong. Thanks for this remembrance, David. We should get together someday to chat about brain science. I’ve been studying it for about a decade now.

  2. David,

    What a great idea: To collect remembrances of exceptionally bad (or good) meetings with Academics (or even celebrities) of renown. I have had a few such experiences. I don’t, at the moment, have sufficient time to expand them into the kind of detailed story you have written here, but Richard Hamming and Ted Hoff stand out as deviating the farthest below my expectations, with Steve Wozniak, Robert Jones, Joseph Weizenbaum, Bob Hope, and Colonel Tom Parker following not far behind.

    On the plus side, P.A.M. Dirac, Seymour Cray, Paul Baran, Buzz Aldrin, Steve Jobs, Heinz von Foerster, Stewart Brand, Keith Henson, Heinrich Bohr, Theo Gray, Wolfgang Haken, Ted Nelson, Richard Dawkins, Ted Turner, Ricky Nelson, The Ramones, Todd Rundgren, Joe Walsh, Andy Johns (of the Glimmer Twins), Joe Esposito (“Elvis’s best friend”), and Kyle Gass all exceeded my expectations, usually both as gentlemen and scholars (for those to whom it applies, and, at least, constructive workers who weren’t afraid to get their hands dirty, for those who wouldn’t appreciate being labeled “scholars.”)


  3. DRW,

    Here’s something which I think does relate to your question:

    When I was working at RCA’s Sarnoff Labs in Princeton, one of the researchers from the original DECtalker neural research project presented a lecture. The “DECtalker” was an early voder peripheral made by DEC, and they had used it in a groundbreaking study of neural net programming. If memory serves, the pertinent neural net organisation (“architecture”) used a back-propagation algorithm to entrain 5 hidden layers of neurons. The training sequence was created by recording a few minutes of conversation between two second-grade girls talking on the playground at school. The audio recording was transcribed to text, and the transcription was subsequently translated into a sequence of ASCII DECtalker commands (“phonemes”). The neural net was entrained by presenting the characters of the text sequentially, while simultaneously presenting the target DECtalker ASCII representation of the corresponding phoneme desired in the translation. Again, if memory serves, the original transcript amounted to around 20 pages of double-spaced text. (It didn’t make any more sense than you would expect from 7-year-olds.)

    As evidenced by recordings produced from arbitrary text presented to the trained net, even with only about 5 to 7 passes through the training sequence, the neural net seemed to do a pretty good job of translation. At the time, I found this extremely impressive, because only a handful of years prior, I had watched from a distance as Bruce Sherwood, one of the most brilliant people I have ever met, spent many months writing a procedural text-to-speech translator for the Votrax. The DECtalker seemed to perform about as well as Bruce’s program. Being familiar with some of the pitfalls, when given the chance to test the DECtalker neural net text-to-speech translator in real time near the end of the lecture, I couldn’t even trip it up with “night” and “ought” and so on. (It had taken months after the initial completion of the original program for Bruce to populate a table containing these kinds of “exceptions” in his program.)

    Finally we have generated the context required to present the question and answer which I think are pertinent to your question: In the Q/A after the lecture, one of my fellow RCA researchers asked if they had any idea how the neural net’s trained configuration might compare to human neurons doing the same task. Would all neural nets faced with the same task converge to essentially the same final configuration?

    The researcher (and, like your date, I wish I could recall his name) responded with something I thought was of fundamental importance, and that I will never forget. Although this had never made it into any of their publications, he said that he and some of his fellow grad students had had a similar curiosity. The “unprogrammed” neural nets were always initially configured with heterogeneous random weights, because if any in-to-out paths of the weights were homogeneous, the back-prop algorithm would adjust them equally. The resulting configurations (paths) would be redundant, and therefore waste resources, or not work at all. E.g., if all weights began the same as each other, then after entraining they may have all changed, but they would still all be the same as each other. Beginning from an initial ‘unique matrix of random neuron connection weights’ always produced a final ‘uniquely entrained resulting network’, even though, for all intents and purposes, after sufficient passes through the training sequence had been applied, all resulting networks behaved in essentially the same manner from a black-box perspective.
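    The symmetry argument above can be illustrated with a toy example. What follows is only a minimal sketch, not the lecturer's network: it assumes a single hidden layer of three tanh units, a linear output, an XOR-style training set, and plain gradient descent. The names `train` and `hidden_units_identical`, the learning rate, and the epoch count are all hypothetical choices made for illustration.

```python
import math
import random

def train(w1, w2, data, lr=0.05, epochs=200):
    """Plain back-propagation on a one-hidden-layer tanh net (toy sketch).

    w1: list of per-hidden-unit input weight rows; w2: output weights.
    Returns updated copies, leaving the arguments untouched.
    """
    w1 = [row[:] for row in w1]
    w2 = w2[:]
    for _ in range(epochs):
        for x, target in data:
            # Forward pass: hidden activations, then a linear output.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
            y = sum(v * hj for v, hj in zip(w2, h))
            err = y - target  # derivative of 0.5 * err**2 w.r.t. y
            # Backward pass: gradients for both weight layers.
            grad_w2 = [err * hj for hj in h]
            grad_w1 = [[err * v * (1.0 - hj * hj) * xi for xi in x]
                       for v, hj in zip(w2, h)]
            # Gradient-descent update.
            w2 = [v - lr * g for v, g in zip(w2, grad_w2)]
            w1 = [[w - lr * g for w, g in zip(row, grow)]
                  for row, grow in zip(w1, grad_w1)]
    return w1, w2

def hidden_units_identical(w1):
    """True if every hidden unit has exactly the same input weights."""
    return all(row == w1[0] for row in w1)

# XOR-style data; the third input is a constant bias term (an assumption).
DATA = [((0.0, 0.0, 1.0), 0.0), ((0.0, 1.0, 1.0), 1.0),
        ((1.0, 0.0, 1.0), 1.0), ((1.0, 1.0, 1.0), 0.0)]
HIDDEN = 3

# Homogeneous start: every weight is the same constant.
sym_w1, sym_w2 = train([[0.5] * 3 for _ in range(HIDDEN)], [0.5] * HIDDEN, DATA)

# Heterogeneous start: small random weights from a seeded generator.
rng = random.Random(0)
rnd_w1, rnd_w2 = train(
    [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(HIDDEN)],
    [rng.uniform(-0.5, 0.5) for _ in range(HIDDEN)],
    DATA,
)

print("identical hidden units after homogeneous start:", hidden_units_identical(sym_w1))
print("identical hidden units after random start:", hidden_units_identical(rnd_w1))
```

    In this sketch the homogeneous start leaves all three hidden units as copies of one another, because each receives exactly the same update at every step, while the random start lets them drift apart. Re-running with a different seed gives a different final set of weights, which is the “uniquely entrained” point in miniature.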

    One of the most fundamental strategies employed by humans faced with learning to read out loud (translate text-to-speech) — perhaps the most fundamental concept taught — is to differentiate between vowels and consonants. The grad students’ question was, did the DECtalker derive a neuron (or 2) which embodied the concept of “consonantness” vs. “vowelness”? The answer was “about 80% of the time.”

    In playing with this configuration, and entraining it over and over again, most of the time they could identify a particular neuron which correlated with the letter being presented to the net being either a consonant or a vowel. But the other 20% of the time, there was no grouping, differentiation, recognition, or partitioning of the net which corresponded to the character it was working on being either a vowel or a consonant. In those cases, the neural net seemed to perform its task just as well, but whatever it ended up doing internally to produce the “correct” result derived from classifications (“concepts”) completely foreign to humans’ approach.

    Although the domain of this study was narrowed to approximately the scope of individual Roman characters, and your question was about the much larger space of language in general, I believe that this provides an existence theorem: If even the same restricted-scope language task can be performed in incomprehensibly different ways, then certainly, when contemplating the larger encompassing domain described by the phase space of all inter-language translations (inter-comprehension) of languages in general, the possibility exists that it will contain incomprehensible cases. The relatively larger size of the phase space describing all languages implies that the probability of language-to-language translations containing such incomprehensible cases is at least as large as that for smaller encompassed domains (such as character-to-phoneme).

    So if you tried to learn to talk to a zingblort from Tralfamadore, let alone a being composed of light-in-propagation from the Pleiades, they might not have a concept which differentiates between verbs and nouns, i.e. between that which changes and that which remains unchanged.

    Closer to home, a fellow researcher once pointed out to me that the language of Physics was the language of differential equations. It is the current practice of physicists that, at the fundamental level, virtually all physical laws are described by differential equations. That, he asserted, is because physics is the study of things that change, the way that they change, and connections to anything else that changes at the same time. From this POV, at the fundamental level, the language of physics is a language consisting only of verbs. How do you study something in physics which NEVER changes? (How can you even describe such a thing?)

    So my final question is, can you understand physicists, and what does that tell you about yourself?

    Sorry to get so carried away, but it is an interesting question.


  4. Fascinating story, Sherwin.

    To answer YOUR final question: I can understand physicists only up to a point. And that’s probably because, having majored in psychology, I didn’t have to learn differential equations.

  5. David – I enjoy pondering the questions you pose and the way you have posed them. The closing question in particular is something I have approached through my own lens and language. Considering your post led me to discover a book review which you might enjoy. It ties “deep structure” concepts of Chomsky, Levi-Strauss, and Jung together in a way I am favorably inclined to consider (http://egajdbooks.blogspot.com/2012/11/20121124-jung-by-anthony-stevens-read.html).

    Perhaps I will share my piece when it is ready.