My Plagiarism

I was going back over yesterday’s post, and something sounded familiar about this paragraph:

A very careless plagiarist takes someone else’s work and copies it verbatim: “The mitochondria is the powerhouse of the cell”. A more careful plagiarist takes the work and changes a few words around: “The mitochondria is the energy dynamo of the cell”. A plagiarist who is more careful still changes the entire sentence structure: “In cells, mitochondria are the energy dynamos”. The most careful plagiarists change everything except the underlying concept, which they grasp at so deep a level that they can put it in whatever words they want – at which point it is no longer called plagiarism.

After rereading it a few times, it hit me. A few days ago, I’d come across this quote from Miss Manners:

There are three possible parts to a date, of which at least two must be offered: entertainment, food, and affection. It is customary to begin a series of dates with a great deal of entertainment, a moderate amount of food, and the merest suggestion of affection. As the amount of affection increases, the entertainment can be reduced proportionately. When the affection IS the entertainment, we no longer call it dating.

I laughed at it, I thought it was great, and I stored it in my head as the sort of thing I should quote at some point in order to sound witty.

And although I wasn’t consciously thinking about it at the time, I’m sure the last sentence of my paragraph comes from the last sentence of Miss Manners’. It would be easy to dismiss it as a coincidence; it probably seems like a coincidence to you, and I can’t explain how I know that the one comes from the other. But when I replay in my mind the process that made me write that, it’s obvious that it did.

This sort of thing happens to me all the time. It’s just that it’s especially ironic when it happens in a paragraph about plagiarism, in a post about how writers blend everything they’ve read into a slurry and spew it out, somewhat transformed. I wrote that “the difference is how finely you blend”, and this is a not-so-rare example of my blending so coarsely that identifiable chunks of my sources have ended up in my own text.

Sometimes I identify turns of phrase that I’ve picked up from other people. Other times it’s more subtle: a style, a way of looking at the world, a method of reasoning. All of these are just different levels of pattern. My writing style is a slurry of the writing styles of everyone I’ve read and enjoyed, with some pieces chunkier than others. I think my worldview and my reasoning style are too; it’s just less obvious.


55 Responses to My Plagiarism

  1. dacimpielitat says:

    that’s why I think some sages say that knowledge is everywhere, belongs to everyone, and can’t be owned by anyone (e.g. via patents and all); we do not invent anything, we just tune to an already existing wave and use what we can “hear” on that wave. we are capable of tuning (like a radio) to different frequencies/waves and using them at the same time, interlaced; the waves/frequencies are created collectively, we are all capable of tuning in and “hearing” them, only some of us use this capability better than others, same as with any other skill.

    when it comes to AI and artificial neural nets, we are very far from even matching our brain, let alone the fact that we as human beings are more than just our brains; so even if we do match our brain, we will still be at the phase of just scratching the surface.

    and speaking about scratching the surface:

  2. RavenclawPrefect says:

    Given that doing this is inevitable and the only way we really can create new stuff (everything is a remix, after all), what are some sources of content that you’ve found valuable to incorporate into your personal slurry of phrases, styles, and snippets?

    I’ve found myself using variations on a lot of dialog from Questionable Content in various social situations, especially those lines taken from its golden age of amusing banter and rapid wit (comics 2000 and earlier or so, in my opinion). To the extent that I’ve adopted any of the speaking styles of characters in Winston Rowntree’s masterpiece Subnormality (long example comic here), I also find that it’s a positive change to my speech and/or textual communication.

    Outside of webcomics, SSC itself is a source of good examples on how to write engagingly and compellingly on almost any topic (Scott’s insertion of micro-humor especially is fantastic); I also find some poetry to be a good source of particularly nice-sounding turns of phrase. But I’d love to find more; do other commenters have suggestions for distinctive and high-quality writers?

    • Scumbarge says:

      as a fellow webcomic wit appreciator, try Bad Machinery or Achewood. Both take a bit of getting used to (Achewood especially) but have some really unique styles and delightful turns of phrase.

    • AG says:

      I end up picking up some tics (both dialogue and body language) from people/characters I’ve been spending a lot of time with. This is good when it’s, say, a popular streamer or Youtuber, as whatever they’re doing has already been shown to be appealing to a lot of people. Less good if it’s, say, a murderous villain character from a TV show I’m bingeing, as that actor might only be able to get away with those tics due to their non-replicable charisma.

      However, this can be somewhat moderated by seeing which of those tics get re-created by fans across various meme formats, as that shows which aspects can be done by other people. So, ironically, the way mediocre fanfiction all converges on the same bland character/relationship, regardless of source material, does point at a relatively universal personality type that mediocre fanfic writers prefer. (And the population of mediocre fanfic writers is more likely to have mainstream tastes than higher-end writers, who have more…artistic tastes.)

      • ec429 says:

        Whenever I watch an Oscar Wilde play I start talking in the ‘voice’ of his characters, and this is distinctive enough to be recognisable. I’m not sure whether this is a good thing, but I don’t seem to have any choice in the matter (it happens by itself) except by taking the drastic and unpleasant step of not watching Oscar Wilde plays.
        I think there are a few other authors that can ‘colonise’ my mind in this way, but Wilde is the most striking example and the only one I can remember right now.

  3. Faza (TCM) says:

    I think you’re reaching here, Scott. What you have here is a case of “language as she is spoke” – in other words: we learn to use language by observing how it is used by other people. Indeed, unless people copied usage from one another all the time, we wouldn’t be able to arrive at a mutually intelligible language.

    This example doesn’t lend any support to the proposition that copying other people’s expressions is all there is to successfully using language (for a start, you knew that this was an appropriate place for such an expression).

    • Scott Alexander says:

      I’m trying to figure out where you disagree with me; can you think of any different predictions whatever theory you’re working off of would make from whatever theory you think I’m working off of?

      I agree most of our blending is much finer than the example above, but it happens at many different levels. For example, I almost never think on a letter-by-letter basis when I’m writing a post. I know a word like “word” and so I include it when it’s appropriate without considering that it’s made of the letters w-o-r-d. I know a phrase like “hoist by his own petard” and I use it without thinking about what a petard is. I know a sentence structure like “I think X” and I use it almost as a unit when it best fits the structure of my argument. And the structure of my argument is a vague combination of generic reductionism, the Chinese Room Experiment, and several other things I’ve seen.

      I think I’m using all of these things logically and appropriately in a way that conveys the right information. But I’m throwing together a few higher-level concepts like reductionism and Chinese Room (again, I think appropriately, with understanding of their meanings), I’m throwing together a lot of things I know about how to frame an argument, I’m filling each part of the frame with sentences whose structure comes from some kind of sentence-structure-generator I have that’s been trained on all the different sentences I’ve seen throughout my life, and then I’m filling those sentences in with specific words and turns of phrase based on some word-generator trained similarly.

      Do you think something else is going on beyond this?

      • Faza (TCM) says:

        The point is that you – as a human writer – start from a high-level model of what you’re trying to say and “fill in the blanks” from a repository of shared language utterances you picked up through communication with other humans.

        It is very much not a “slurry” of things that you’ve read, that you’ve subsequently “blended and spewed out”.

        The fact that you will be using utterances that other people have used is a given – we cannot communicate at all unless we’re using the same language in the same way. However, your aim is to produce a message that will be understood and you can verify this understanding by requesting a “read-back” using different words/phrases and checking whether what you understand the “read-back” to mean is the same thing you meant in your original message.

        It bears underscoring that the message should remain intact regardless of actual words/phrases used.

        In the broader context of your previous couple of posts, this one appears to be marshalling support for the proposition: “[Y]our mom is a brute-force statistical pattern matcher which blends up the internet and gives you back a slightly unappetizing slurry of it when asked.”

        It offers no such support. You could have communicated exactly the same information without using a phrase you copped from Miss Manners (which, incidentally, would neither count as plagiarism in most analyses – being a sufficiently common turn of phrase – nor does it capture the actual witticism of the original sentence – that being the “affection IS the entertainment” bit).

        You are not a blender, Scott.

        • Scott Alexander says:

          The point is that you – as a human writer – start from a high-level model of what you’re trying to say and “fill in the blanks” from a repository of shared language utterances you picked up through communication with other humans. It is very much not a “slurry” of things that you’ve read, that you’ve subsequently “blended and spewed out”.

          Yes, agreed, I start with a high level model. I’m not opposing “high-level model” and “slurry”. I’m saying that when you blend a bunch of particular high-level models, you get a vague skeletal high-level model which you can do anything with, which is what I’m calling “slurry”.

          Take eg one of those face generation AIs. It must have a high-level model of what a face is, eg two eyes + a mouth (which is also a slurry of all the faces it has ever seen). Then it “fills in” specific features like the eyes with a slurry created from all the eyes it’s ever seen (which is also a high-level model that eyes are made of pupil, iris, etc).

          I think GPT-2 is similar. When it writes a sentence, with a subject and predicate and so on, it’s starting with a highish-level model of what a sentence must be, then filling each part in. When it writes a version of Moloch, it’s starting with a high level model that it should be a bunch of sentences all starting with “Moloch”, then ending with an exclamation point. It’s obviously not great at this yet, but it’s trying.

          All that I’m arguing from my semi-plagiarism is that when I’m filling in my high-level model, I use things I’ve taken from elsewhere (which can also be high-level models of their own).

          • Faza (TCM) says:

            Do not mistake formal models for semantic models.

            Constructing a face from “two eyes, two ears, one nose, etc.” is starting from a formal model of what a face looks like. Similarly, if you start from a formal model of a sentence – such as “subject-verb-object” – and fill in the blanks, you’ll get a grammatically correct sentence.

            Foo bars baz.

            Ok, that’s a grammatically correct sentence. Why should we care if foo does indeed bar baz? Or not – if the statement is a lie.

            What GPT-2 doesn’t do – whilst you do, just like every other human writer – is construct a phase space of semantically equivalent sentences. Admittedly, we seldom do this consciously – at least unless we’re editing stuff we already wrote.

            Having seen further examples of stuff generated by the open source version, I’ve revised my appraisal of GPT-2 to “not much more powerful than a Markov generator”. Even the published, cherry-picked best examples become incoherent quite quickly – even on a sentence level:

            There is much agreement that it was essentially a war of slavery on behalf of capitalism, about a century of slavery.

            This sentence doesn’t make sense, unless we try very hard as readers.

            ETA: Actually, the very first sentence of the Civil War essay is no better:

            It is easy to identify why the Civil War happened, because so many people and so many books and so much television and films tell us that it was the cause, that it has something to do with race or economics or religion.

            It’s just far enough removed from pure gibberish that we can sort of make out a meaning there, but it is by no means a correct sentence that conveys a meaningful message.

            It does vaguely look like something a poorly performing student might write, but that’s because a poorly performing student doesn’t know much about the Civil War, doesn’t care much about the Civil War and probably couldn’t be bothered to learn how to communicate effectively in writing (many of the finest examples of poor writing I have seen share a common trait: attempting to use more complex modes of expression than have been mastered by the author or are appropriate to the subject matter; student essays could be improved considerably, I think, if we placed more emphasis on KISS).

            I’m willing to put dollars against peanuts, however, that a student capable of producing a comparable essay on the Civil War will nevertheless be able to produce quite coherent, and possibly informative, verbal communications on subjects that they actually do care about.
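
            For readers unfamiliar with the baseline Faza invokes above, a word-level Markov generator can be sketched in a few lines of Python. This is a toy illustration under my own assumptions (the function names and the single-word “order-1” state are mine, not any particular implementation): it only ever emits a word that has been seen to follow the current word in the training text, which is what makes its output locally plausible but globally aimless.

```python
import random
from collections import defaultdict

def train_markov(text, order=1):
    """Build a word-level Markov model: map each state (a tuple of
    `order` consecutive words) to the list of words observed to
    follow that state in the training text."""
    words = text.split()
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model

def generate(model, length=20, seed=0):
    """Walk the chain: start from a random state, then repeatedly
    append a randomly chosen observed successor of the current state."""
    rng = random.Random(seed)
    state = rng.choice(list(model.keys()))
    out = list(state)
    for _ in range(length):
        successors = model.get(state)
        if not successors:  # dead end: this state was never followed
            break
        nxt = rng.choice(successors)
        out.append(nxt)
        state = tuple(out[-len(state):])
    return " ".join(out)

# Tiny demonstration corpus (invented for illustration).
model = train_markov("the war was a war of slavery and the war was about capitalism")
print(generate(model, length=10))
```

            Every adjacent word pair in the output occurs somewhere in the training text, yet nothing constrains the sentence as a whole to mean anything – which is roughly the failure mode being attributed to GPT-2 here, just at a much larger scale.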

          • albatross11 says:

            It’s interesting that when we talk about plagiarism, we normally mean only copying the words. There are a lot of cases where someone more-or-less rephrases something you’ve written, and it may p-ss you off but isn’t considered plagiarism. (Though if you publish a paper reusing someone else’s work without citing it, you may end up in some hot water.) Journalists get pretty mad about this happening (and it happens regularly): journalist A writes a story about some topic, and later on journalist B writes a story covering the same points without referencing journalist A’s story. I don’t think anyone considers this plagiarism.

            Mirroring a clever turn of phrase in someone else’s writing in your own words doesn’t seem anywhere close to plagiarism. It’s a common thing people do, part of how language works.

          • FeepingCreature says:

            I would be more inclined to agree that this is a knockdown argument if I hadn’t been reading and listening to Trump a bunch.

            I’m not saying that to insult the president, but we have a very high profile example of somebody who isn’t necessarily bad at speech but, let’s say, doesn’t see a need to do much filtering of his internal process before speaking.

            I couldn’t speak like Trump. Checking over the things I’m saying, have said and am planning to say is a deeply ingrained instinct. But I don’t doubt that the output of my consciousness before internal consistency checks sounds very Trump-like. And so does gpt-2! Its formalisms, repetitions, meandering around topics, and sometimes outright descent into nonsense are strikingly reminiscent of Trump’s stream-of-consciousness style of speech.

            gpt-2 can’t recurse. It can’t parse back its own speech, reintegrate it into a persistent model, and cross-check it for sanity. But that doesn’t mean it’s necessarily stupid or lacks any high-level structure; it just means that it’s missing a particular stage necessary to make human speech sound coherent. Ignoring this stage, the stuff it outputs doesn’t sound that much stupider than the stuff that “an average human brain without habitual consistency filtering”, i.e. Trump Speech, outputs.

            You look at gpt-2 and think “Huh, the fact that this sounds incoherent implies it must be far below humans.” I look at gpt-2 and think “Huh, the fact that this sounds recognizable implies that human speech must be way less involved than I thought.”

          • mcpalenik says:

            I would be more inclined to agree that this is a knockdown argument if I hadn’t been reading and listening to Trump a bunch.

            I think this writing is much less Trump-like than a few people have indicated, mainly for the reasons that I tried to elaborate on in my comments on the previous thread. In short, I think there’s a very significant difference between the way GPT-2 and I would go about constructing a narrative.

            If I were asked to continue the Lord of the Rings fanfic, for example, my process would be something like:

            I know what characters and locations should be in this story, and I know the context for where it starts. I’d like to tell a story with a satisfying conclusion, so maybe I’ll say the goal of the characters is to return the ring or whatever (as you might be able to tell, I didn’t enjoy the books or the movies all that much and don’t remember the details, but if I were really writing this, I’d research it). Knowing that the characters return the ring, I want to make the journey interesting, which means there should be some tension, danger, maybe some moments where it looks like the characters won’t complete their journey.

            And maybe I’d like to see the characters grow as well, and show emotions that make the reader empathize with them. I want the reader to feel like he’s in the character’s shoes, to fear when the character fears and be happy when the character is happy.

            I could draw on memory of similar stories, but I’d ask myself “how did I feel when I read that?” and “did I enjoy this part of the story?” Then, I would put this all together to fill in the bits of narrative that I had outlined in my head.

            I would have a model where characters are actual entities, where objects have permanence, and where there is a purpose behind everyone’s actions leading to an ultimate goal. I would then use my own thoughts and feelings to determine the most interesting way to tell this story.

            (I also think Trump is doing something roughly equivalent to this when he gives a speech, because that’s how humans operate)

            By contrast, while I think GPT-2 does certain things similarly, there are other things that it simply cannot be doing.

            It seems to be taking the prompt, matching it to a certain category, and perhaps extracting various names, places, and words typically associated with that category. This is much the same as my first step.

            Next, it seems to search through strings of text associated with those categories, and use them to construct some kind of semantic structure that should come next, with keys generated from the input inserted at certain places. And then something like this repeats for a while.

            There’s no goal to the writing, no point to the story, no real narrative. It just generates combinations of characters that maximize some number and spits the result out. There seems to be no intent to tell a story or real understanding of what objects actually are, which is why things happen like the number of rings changing and characters popping in and out of existence.

            It seems very easy, when this fails to replicate normal human writing and speech, to search for something similar that humans do and then decide it must be doing that. So, we might say “a lot of weird things happen in dreams” and “a lot of weird things happened in its Lord of the Rings fanfic”, therefore it must be doing something like what humans do when they dream. But people tend to anthropomorphize the world around them (people do it with pets all the time, for example), and I wouldn’t be surprised if this is no different.

          • Ron says:

            I think there’s a good way to think about this. Well, at least in my head.

            On the one hand, yes, prediction is everything and a good enough predictor – one who perfectly predicts data at all levels – can be argued to be indistinguishable from Human. For example, predicting low level words, mid-level sentences, and high-level concepts (as expressed in text) should permit one to pass a written Turing test. So the current level of prediction is super cool in the sense that it can predict, to some degree, very high level concepts as expressed in text. And because we already manage to predict high level concepts, maybe we can get better by just scaling the current approach. When we get 100% prediction of high level concepts, we’re done.

            On the other hand, maybe it’s so ridiculously difficult that it’s not practical, and we’re using human intuition to extrapolate difficulty from solved to unsolved AI problems and implicitly replace “difficult to AI” with “difficult to me”. Sure, this is an old argument, but why is a very partial success in predicting high level concepts making this argument less valid?

          • FeepingCreature says:

            I know what characters and locations should be in this story, and I know the context for where it starts. I’d like to tell a story with a satisfying conclusions, so maybe I’ll say the goal of the character’s is to return the ring or whatever (as you might be able to tell, I didn’t enjoy the books or the movies all that much and don’t remember the details, but if I were really writing this, I’d research it).

            Right, that’s what I’m saying. That is the recursive readback and consistency filter. Don’t imagine you’re you trying to write a Lord of the Rings fic; imagine you are a twelve-year-old girl trying to write a Lord of the Rings fic. It has to have Legolas in it, because hnng Legolas, or maybe Gimli, I don’t judge, so somehow Legolas has to sensibly appear in a scene, and then maybe Legolas has to appear in the next scene, and maybe there’s some scenes where Legolas can’t appear because it’s blatantly implausible, but this is a probabilistic process, not a rule-based deterministic one. We don’t iterate the story forward based on a ruleset, we have a feeling for what sort of thing should come after what other thing.

            And the thing is, even your process works like that! When you’re working towards a satisfying conclusion, you’re applying a high-level model of what sort of thing should appear at the end of a story, how a narrative ought to flow; you have a concept of, maybe, a three-act structure, the hero’s journey, etc. Those are all things that gpt-2 has not learnt and cannot learn, first because its scope of attention doesn’t span an entire book and second because it wasn’t fed books. But if it was, and if we threw a lot more compute and parameter size at it, I don’t see why it couldn’t pick up on what sort of patterns tend to appear at the start vs. the end of a story, how plot devices appear and reappear, etc. And sure, it couldn’t describe its process, but neither can many authors. And sure, it couldn’t reflect on its process, but neither do most people. And sure, there would be temporal inconsistencies and plot holes. But fanfics are practically famous for those anyways, and even published authors are not immune.

            I think competent fantasy storytelling, maybe lacking spark or imagination or innovation or true understanding but sort of … rote remixing, of the sort that people manage to successfully sell, is plausibly in range of a gpt-like architecture, and getting there mostly looks like the gpt-2 output slowly moving from “bad fanfic” (which does not seem to me that far beyond its capabilities) to “good fanfic” and beyond.

            In summary, you’re saying it’s bad because it mostly fails to hold temporal consistency or large pattern. I say it’s good because it holds any temporal consistency and applies patterns at some subset of scale. Getting it to do it at all seems to me like the hard part.

          • mcpalenik says:

            In summary, you’re saying it’s bad because it mostly fails to hold temporal consistency or large pattern. I say it’s good because it holds any temporal consistency and applies patterns at some subset of scale. Getting it to do it at all seems to me like the hard part.

            Not quite. I’m saying it’s bad because it only generates content based on recombining similar content with the goal of matching patterns within that content. When a person does something qualitatively similar, it’s for different purposes and with a multitude of other considerations, and can draw on additional things that the algorithm doesn’t even have a mechanism for interpreting.

            I think this is anything but a small difference and might actually turn out to mean that the two mechanisms of content generation are actually almost totally unrelated. Either way, I doubt it means a significant step toward human-like intelligence.

  4. Michael Arc says:

    Agreed with Faza here.

    I also want to suggest, in particular, that given that deep learning seems particularly to imitate the cortex, it’s reasonable to suggest that remixing is specifically what the cortex does, but that this cortical activity is, in a human, integrated with deeper cognitive structures such as cerebellar motor abilities and hippocampal memory storage and retrieval.

    I also believe that it may be possible to get high performance from just the cortical part of motor control, once the cortex has been trained up by the cerebellum, but that the cerebellum is needed for the initial training.

    Relatedly, environmental processes which damage non-cortical cognitive function may produce only subtle effects on cognition. I hypothesize that the elite behaviors Robin Hanson describes here may be induced by this sort of sub-cortical damage reducing the constraints on elite behavior, in a manner analogous to how reducing cerebellar function presumably deconstrains cortical motor learning.

  5. liskantope says:

    One of my main frustrations as a writer is that I’ve never felt able to settle into my own personal style. In writing fiction, I’m very good at imitating the narration style of someone I know well — for instance, I’ve written Harry Potter fanfic using a voice deliberately calculated to sound like J. K. Rowling and an Alice in Wonderland satire emulating the style of Lewis Carroll (and did well at both, I think). But part of the reason I’ve never taken up fiction-writing as a hobby in my adult life is a feeling of not knowing what my own voice is. As for nonfiction essays, my hobby of writing those started largely as a result of discovering Slate Star Codex, and I’m afraid my style has a strong flavor of Scott Alexander whether I want it to or not (not that that’s a bad flavor at all, but I can’t pull it off well in comparison to our host, and I’d just rather sound more genuinely like “myself”, whatever that is). More unfortunately, I’m afraid my style also has a faint whiff of “mathematical article”, which is fine for my professional papers (there’s not that much variance in style for mathematical exposition anyway) but seems very dry and stilted outside the context of mathematical proofs.

    • Kaj Sotala says:

      I think you just need to write more. People start by imitating the writing styles of people they like, then gradually start to deviate from that until they have their own voice.

      • fr8train_ssc says:

        This is what I’ve found as well. It goes back to making pots. I would add that there are two other important things that you need, one that the article mentions, and one that isn’t mentioned.

        The one thing mentioned in the article is Slack. If you have spare time, you can afford to make mistakes. If you can’t afford to make mistakes, then you will not benefit from this approach.

        The other portion, though, not mentioned in the article, is feedback. Practice doesn’t make perfect so much as it makes permanent. If you’re good at critiquing yourself, this works well; but rationally we should anticipate Dunning-Kruger, which means one is unlikely to self-critique well unless they already have enough experience. In that case, having a peer group to critique with helps. This not only gives you more impartial feedback, but you’ll also develop your own feedback skills (assuming you’re taking the opportunity to critique others’ work), so that you’ll be able to step back and apply them to your own work.

  6. chaosmage says:

    I strongly recommend B.F. Skinner’s “Verbal Behavior”. He made much the same point, and much more extensively, 60 years ago.

    Skinner is underrated, partly because he wrongly denied the existence of working memory but probably more because he has no need of the variable of the “self”.

    • RC-cola-and-a-moon-pie says:

      Yes. “Verbal Behavior” is an absolute classic, and presents an analytical framework that could so easily have been developed majestically. And the Chomsky review was so unfair! Skinner’s published notebooks also contain a great many anecdotes about Skinner’s own verbal behavior with hints how they could fit into his system of verbal behavior, kind of along the lines of this post.

      Edited to note the reservation that I would hesitate to characterize Skinner’s approach as “the same” as the one Scott is developing in his recent posts though!

    • kevin says:

      Skinner was done in by Martin Seligman’s experiments demonstrating “learned helplessness.”

      My favorite joke of the time was “Behaviorists can explain everything except the behavior of behaviorists.”

  7. onyomi says:

    I have long had a similar suspicion about creativity: that “creativity” is just combining old pieces in ways no one else has combined them before, primarily by assembling a set of pieces no one has yet assembled in one mind (or team? but not sure this works) and then adequately mulled over before.

    In this respect, ideas seem to be like evolution, which happens faster in an ocean than in an inland sea or a remote mountain river. But this doesn’t mean “coelacanths” have nothing to teach us.

    • Faza (TCM) says:

      I wouldn’t be so sure about that: how does Conway’s Game of Life figure in all of this?

      To be less cryptic: you can generate new things by assuming a set of rules and seeing where this takes you. Those rules don’t actually have to refer to – or be based on – anything previously experienced; they simply have to be internally consistent.

      Such a rule-set can be described as a state machine where for any particular state there exists one and only one subsequent state. Conway’s Life fits the bill.

      The funny thing about Life is that whilst the rules are incredibly simple and the game is fully deterministic, the higher-level constructs that have been found cannot be reliably predicted from a mere knowledge of the rules. The only way to establish that Life is Turing-complete and that you can implement Tetris in it is to run it and see what happens.

      It’s not necessarily clear that Life is a “combination of old pieces in a way no one else has combined them before”. There are antecedents, of course, but one could create the game from whole cloth simply by specifying the possible local state transitions captured by the rules. More importantly, one can imagine any number of possible state machines that may produce results every bit as interesting as Life.
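      Faza’s point can be made concrete: Life’s whole rule set fits in a few lines, yet nothing short of running it tells you what the patterns will do. A minimal sketch (my own toy version, not anyone’s reference implementation):

```python
from collections import Counter

def step(live):
    """Advance one Life generation. `live` is a set of (x, y) cells."""
    # Count live neighbours for every cell adjacent to a live cell.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell lives next generation iff it has exactly 3 live neighbours,
    # or exactly 2 and it is currently alive. That is the entire rule set.
    return {c for c, n in counts.items() if n == 3 or (n == 2 and c in live)}

# A "blinker": three cells in a row oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
```

      The function above is the complete specification of the state machine; everything else (gliders, guns, Turing-completeness) has to be discovered by iterating `step` and looking.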

    • Loris says:

      In this respect, ideas seem to be like evolution,

      The meme of memes.

  8. Emperor Aristidus says:

    Again, I gotta agree with Faza. Obviously, the way in which you learn language per se is a lot like the way GPT-2 does. But GPT-2 is language running on autopilot, basically free-associating words. It does not, cannot understand the meaning of words.

  9. Murphy says:

    While I slightly disagree with your previous post re: how much of a step towards general AI it is, that system looks very very cool.

    I’d love to see its accuracy for extracting data from natural language text.

    If it has a lower error rate than bored undergrads dealing with the 300th page of text then it could be very very valuable.

    If the accuracy is good, then there are hundreds of millions of scanned pages of medical notes that could be read, with summary data extracted into a cleaner, analyzable format, and that would make a lot of public health researchers salivate.

  10. indigo says:

    The most careful plagiarists change everything except the underlying concept, which they grasp at so deep a level that they can put it in whatever words they want – at which point it is no longer called plagiarism.

    “Restate the concept in your own words” is what they told us in high school. Then, in college, the professors tried to have us unlearn this and replace it with “cite your sources”. Failure to cite sources is also called plagiarism, although I think some institutions put it under the broader umbrella of “intellectual dishonesty”.

    I’m terrible about this. I absorb information like a sponge, good or bad, and fail to attach context to it. Many times I’ve told my spouse a cool new idea I’d heard somewhere, and it turns out she was the one who told me it in the first place. Worse, I don’t attach the authority of the source to the information, so I’m prone to picking up stuff from less reliable sources. I would have a pretty bad time as a research scientist trying to put together references.

  11. nameless1 says:

    >As discussed here previously, any predictive network doubles as a generative network.

    Only in a very limited sense. If you train it to recognize cute puppies, i.e. to tell them from everything else, and invert it, it will generate something like a very minimalist artwork of puppies: the basic minimum information that serves as the key to tell cute puppies from everything else, the basic minimum an artist would draw in 30 pencil lines that would still be a recognizable drawing of a cute puppy.

    At least if you have built it with resource conservation in mind, and the human brain is. Of course, if it is more of a brute-force type of predictive network, then inverting it generates more information. I still doubt it would generate photorealistic images of cute puppies, because even a very non-conservative network does not need every tiny pixel detail to recognize a cute puppy, so it will not store that data.
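    A toy illustration of the “minimalist artwork” point (my own construction, assuming a linear classifier for simplicity, with made-up dimensions): gradient-ascending on the *input* of a trained detector recovers only its weight vector, the bare discriminative caricature of the class, not a detailed image.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=16)   # stands in for a trained "puppy detector"
x = np.zeros(16)          # start "generation" from a blank input

for _ in range(100):
    score = w @ x         # classifier output for the current input
    grad = w              # d(score)/dx is just w for a linear model
    x += 0.1 * grad       # ascend: make the input more "puppy-like"

# The generated input is proportional to w: the minimal key the model
# uses to discriminate, with no detail beyond that.
cos = (x @ w) / (np.linalg.norm(x) * np.linalg.norm(w))
```

    Here `cos` comes out at 1: the “generated puppy” is exactly the detector’s weight direction, which is the sense in which inverting a resource-conserving predictor yields a sketch rather than a photograph.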

    Which means there is really something off with our memories and dreams. If our brains only stored the basic minimum information needed to recognize things, our memories would be like minimalist pencil sketches. Instead, our dreams – fueled by memories – are as fully detailed as reality. This is wasteful for prediction and recognition.

    Which means our memories have purposes other than prediction and recognition.

    EDIT: actually not. I think you shared the link a while ago showing that face recognition in the human brain works like the character generation in Skyrim or a similar game, in reverse: using one slider to set the width of the chin, another to set the distance between the eyes, etc. The brain stores these numbers, not the whole image. Which conserves resources.

    And then we meet someone we know in a dream and still see a full, perfectly detailed face. Which means there is a module in the brain that takes these numbers and generates the image.

    Which means, kind of, holy shit. Because, you see, the faces of my high school classmates are both more important and require less space to store than the whole school building. And yet I can see that building in my dreams. So the building was also stored in a similar way – one number for the length of the corridor, another for the color of the wall, etc., not the whole bitmap.

    Which means not only is the brain not storing bitmaps, it is not even storing stuff the way 3D graphics are stored in a game, as the coordinates making up the shape plus a texture. I mean, we already saw that for faces: stuff like “width of chin” is far more abstract information than the coordinates of all the vectors making up the 3D model of the chin. Our face recognition does INTERPRETATION; it has concepts like eyes or chin.

    Which means even my high-school-building recognition, and thus generation, has concepts like corridor, door, and floor, and stores only similarly minimal info.

    And from these abstractions the dream can generate a believable picture. Holy shit. This really means you cannot separate “seeing” from “interpreting”.

    When you argue with someone and you clearly see the world differently, it means you really *see* the world differently; it isn’t a metaphor. I mean, not right at the moment of looking at something. But the next day, what both of you have is memories of what you saw, and these memories are recalled and generated from stored concepts like “eyes” or “corridors”, not raw visual data. And thus, in recalling it, you literally see it differently if your conceptual interpretation was different. In recalling the memory, if you interpreted it differently you will get a different friggin’ visual image. NO SUCH THING AS REMEMBERING ACCURATELY AND OBJECTIVELY. All I can say is, holy shit.

  12. Freddie deBoer says:

    It happens to the best of us!

  13. fr8train_ssc says:

    If John Cleese is anything to go by, plagiarism is indeed just pattern matching with incomplete input from the author. Once you find something specific to your scenario, or connect some disparate ideas and mix them as ingredients into the pot, you create something original. Sometimes the ideas themselves don’t have to be original; they just have to be said at the right time, in the right way, to be prophetic or accepted.

  14. Mustard Tiger says:

    I was pretty sure I invented the word “spork” in elementary school.

  15. Nornagest says:

    “Only be sure always to call it, please, ‘research’.”

  16. mcpalenik says:

    Has anyone else been thinking about the experiment described on this blog where they tried to find out if formerly blind people could identify shapes such as spheres and cubes after their sight was restored based on what they had felt with their hands when they were blind? The answer turned out to be no. I think that’s relevant when it comes to questions of building a model of language versus understanding the world.

  17. alwhite says:

    Is the word plagiarism being used here because of the humor of the context? It doesn’t seem like you can be accused of plagiarizing a turn of phrase.

  18. Edward Scizorhands says:

    Some people have better memories than others. People who are forgetful will lose the precise wording of something they read long ago. People with better memories will end up accidentally plagiarizing.

    At least, that’s my excuse.

  19. Rm says:

    …and some metres of poetry are so rare they seem to us to be monotypical, limited to a single verse. And then someone else writes something in that rare metre, and an internal critic cries: he cannot do it! This belongs to Hugo!

  20. Don P. says:

    Scott, this Miss Manners quote is the single quote of hers – from 30 years ago! – that I remember, and I must point out that you didn’t extend the quote through to the excellent next line: “NOTE: at no point in this process is it acceptable to omit the food.” (Yes, I know, you wanted to end on the phrase you indicated.)

    Also, on the plagiarism question: surely it matters that you applied the concept “we no longer call it X” to a completely different topic from Miss Manners.

  21. Radu Floricica says:

    Which reminds me that the first reaction when I found ssc was “Awesome, thelastpsychiatrist started writing again!”

  22. kevin says:

    Pattern matching is fast thinking. That works for your assertion.

    Reasoning is slow thinking. Machine learning hasn’t been very successful at developing theories of mind for slow thinking. It’s a totally different problem. I suppose you could argue that it’s still pattern matching but metacognition is a difficult beast.

  23. Molehill says:

    I agree with your final paragraph, as long as you don’t view it as an insult.

    Analogous to your “Your mom…” line, it’s what everybody does:

    Just some do it better than others.

  24. Phil H says:

    I think I can operationalize the distinction that Faza is trying to draw above.

    The problem with what GPT-2 does is that it *only* draws on linguistic input, and it has *no* external checks on its linguistic output. For me, the mark of successful use of language is that it refers successfully to something non-linguistic. For example, if an AI can generate the sentence “Zebras are black and white,” but it can’t use that sentence to assess pictures of zebras or not-zebras, then it isn’t successfully using the words. It’s just making slurry.

    There are various standards of referring to which one could hold an AI. Some people have mentioned things like essences – you could say that an AI is only successfully using a word (or a phrase or a linguistic construction) when it knows the essence of the thing it refers to. Personally, I think that standard is too high. I’m not sure that people know essences, or how you would test whether an AI knows an essence.

    A second possible standard would be knowing everything about X that the average human user of a word knows. So for the word horse, the AI would have to know (by which I mean be able to accurately identify and maybe reproduce) what a horse looks like, sounds like, and maybe feels and smells like.

    The lowest standard, and the one that I favor, is any linking to any non-linguistic aspect of the thing referred to. In practical terms, that’s going to be pictures, because AIs have access to a lot of pictures, and pictures are very computable using our current technology. So if and when AIs can successfully translate between pictures and language, and reflect changes in pictures with changes in language and changes in language with changes in pictures, I will be willing to grant that they are using language successfully.

    Until that happens, I genuinely think that all of the incremental innovations in language algorithms are a waste of time (so far as teaching computers natural language is concerned – they may have brought other benefits).

    There are a couple of problems with my standard. (1) It may not be high enough: for most things in the world, pictures are just representations, and the words refer to the real things in the world. A computer that can use language successfully by my criterion may still not be able to talk properly, because it doesn’t know that M C Escher paintings are impossible. (2) My standard may be too high. If an AI managed to crunch text so successfully that just by processing the text on the internet, it was able to talk like a real person, my criterion would still say that it’s not using language successfully.

    But I think that this in-system computing versus out-of-system verification distinction is a helpful one when trying to think about how successful an AI’s language use is or is not.

    • Phil H says:

      And just to relate that to today’s post: I’m sure that Scott did indeed come up with his turn of phrase by mining it from the slurry of other people’s language. But he also checked to see if it was conceptually accurate, and indeed it was. It’s not just “the kind of thing that people write when they’re writing on this topic,” which is what GPT generates. It’s actually an accurate representation of the thoughts that he had.

      (Caveats: (1) I agree that kids probably often learn language in the way that GPT is working, and adults probably do it more than we’d like to admit. (2) It remains possible that Scott’s ideas are indeed just a slurry of the kind of ideas that people have when they’re thinking on this topic – but in that case the slurrying is happening on a different level: the conceptual level, not the linguistic level.)

  25. Yair says:

    I think it was Barthes who said that language/culture uses the author to create the text? I wish I could find the actual quote, because this post seems to suggest something similar.

  26. One of my most quoted lines is:

    The direct use of physical force is so poor a solution to the problem of limited resources that it is commonly employed only by small children and great nations.

    The original quote, not by me, was about the game of chicken.

  27. Nicholas Conrad says:

    Hey Scott, when I try to comment on the rip cw thread post I get a page-not-found error. Just wanted to say it’s easier for me to comment on posts when I disagree than when I agree, and I’m sorry if my negativity bias has affected you personally, since you mentioned you sometimes have a tough time with criticism. I really enjoy your blog and think you’re smart and funny, even when I also think you’re wrong about something in particular. I hope this doesn’t count as the kind of expression of sympathy that will make you feel worse, and I’m glad you found a solution to a problem that was causing you anxiety. Keep on keeping on.

  28. openendings says:

    I recommend, in the strongest possible terms, Syntax as Style by Virginia Tufte. (Yes, that Tufte.)
    It’s a “style manual” consisting entirely of interestingly shaped quotes the author filed away, spanning everything from newspaper articles to nineteenth century romance novels to Harry Potter. It’s the training corpus my plagiarizing brain never knew it needed 😛

  29. Garrett says:

    The entertainment/food/affection bit is probably the most actionable dating advice I’ve received in my whole life. Why couldn’t someone have told me this 20+ years ago?

  30. linkhyrule5 says:

    I have the exact same experience and have always wondered if people who I steal ideas/styles/worldviews from can tell :V.