Having programmed for many years in many languages, I often find myself thinking in a kind of natural-language pseudo-code, then translating it into whatever I'm working with at the time. So one day I thought, why not simply code at a natural-language level and skip the translation step? I talked it over with my elder son, also a programmer, and we decided to test the theory. Specifically, we wanted to know:
1. Is it easier to program when you don’t have to translate your natural-language thoughts into an alternate syntax?
2. Can natural languages be parsed in a relatively “sloppy” manner (as humans apparently parse them) and still provide a stable enough environment for productive programming?
3. Can low-level programs (like compilers) be conveniently and efficiently written in high-level languages (like English)?
And so we set about developing a "Plain English" compiler in the interest of answering those questions. And we are happy to report that we can now answer each of those three questions, from direct experience, with a resounding “Yes!” Here are some details:
Our parser operates, we think, something like the parsing centers in the human brain. Consider, for example, a father saying to his baby son:
“Want to suck on this bottle, little guy?”
And the kid hears,
“blah, blah, SUCK, blah, blah, BOTTLE, blah, blah.”
But he properly responds because he’s got a “picture” of a bottle in the right side of his head connected to the word “bottle” on the left side, and a pre-existing “skill” near the back of his neck connected to the term “suck”. In other words, the kid matches what he can with the pictures (types) and skills (routines) he’s accumulated, and simply disregards the rest. Our compiler does very much the same thing, with new pictures (types) and skills (routines) being defined -- not by us, but by the programmer -- as he writes new application code.
A typical type definition looks like this:
A polygon is a thing with some vertices.
Internally, the name “polygon” is now associated with a type of dynamically-allocated structure that contains a doubly-linked list of vertices. “Vertex” is defined elsewhere (before or after this definition) in a similar fashion; the plural is automatically understood.
A typical routine looks like this:
To append an x coord and a y coord to a polygon:
Create a vertex given the x and the y.
Append the vertex to the polygon’s vertices.
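For readers who think in conventional languages, here is a rough Python sketch of what the type definition and routine above amount to. It is only an illustration: the class names, the `first`/`last`/`prev`/`next` fields, and the `append` method are assumptions made for this example, not the structures the compiler actually generates.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class Vertex:
    """One node in a doubly-linked list of vertices (hypothetical layout)."""
    x: int
    y: int
    # Links are excluded from repr/compare so printing or comparing a
    # vertex does not walk the whole list.
    prev: Optional["Vertex"] = field(default=None, repr=False, compare=False)
    next: Optional["Vertex"] = field(default=None, repr=False, compare=False)


class Polygon:
    """'A polygon is a thing with some vertices.'"""

    def __init__(self) -> None:
        self.first: Optional[Vertex] = None
        self.last: Optional[Vertex] = None

    def append(self, x: int, y: int) -> Vertex:
        """'To append an x coord and a y coord to a polygon.'"""
        vertex = Vertex(x, y)           # Create a vertex given the x and the y.
        vertex.prev = self.last         # Append the vertex to the polygon's vertices.
        if self.last is None:
            self.first = vertex
        else:
            self.last.next = vertex
        self.last = vertex
        return vertex

    def __iter__(self) -> Iterator[Vertex]:
        node = self.first
        while node is not None:
            yield node
            node = node.next


polygon = Polygon()
polygon.append(0, 0)
polygon.append(3, 4)
```

The Plain English version says the same thing in three lines, which is rather the point.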
Note that formal names (proper nouns) are not required for parameters and variables. This, we believe, is a major insight. My real-world chair and table are never (in normal conversation) called “c” or “myTable” -- I refer to them simply as “the chair” and “the table”. Likewise here: “the vertex” and “the polygon” are the natural names for such things.
Note also that spaces are allowed in routine and variable “names” (like “x coord”). This is the 21st century, yes? And that “nicknames” are also allowed (such as “x” for “x coord”). And that possessives (“polygon’s vertices”) are used in a very natural way to reference “fields” within “records”.
Note, as well, that the word “given” could have been “using” or “with” or any other equivalent since our sloppy parsing focuses on the pictures (types) and skills (routines) needed for understanding, and ignores, as much as possible, the rest.
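To make the "sloppy parsing" idea concrete, here is a toy Python sketch of the matching step. It is not the actual parser: the vocabulary tables and the `sloppy_parse` function are hypothetical, invented for this illustration. It simply keeps the words it recognizes as skills, types, or nicknames and drops everything else.

```python
import re

# Hypothetical vocabulary; in the real system these would come from the
# programmer's own type and routine definitions.
KNOWN_TYPES = {"x coord", "y coord", "vertex", "polygon"}
NICKNAMES = {"x": "x coord", "y": "y coord"}
KNOWN_SKILLS = {"create", "append"}


def sloppy_parse(sentence):
    """Return (skill, things), ignoring every word that is not recognized."""
    words = re.findall(r"[a-z]+", sentence.lower())
    skill = None
    things = []
    i = 0
    while i < len(words):
        one = words[i]
        two = " ".join(words[i:i + 2])
        if two in KNOWN_TYPES:            # names may contain spaces
            things.append(two)
            i += 2
        elif one in KNOWN_TYPES:
            things.append(one)
            i += 1
        elif one in NICKNAMES:            # "x" stands for "x coord"
            things.append(NICKNAMES[one])
            i += 1
        elif skill is None and one in KNOWN_SKILLS:
            skill = one
            i += 1
        else:                             # blah, blah: disregard the rest
            i += 1
    return skill, things


print(sloppy_parse("Create a vertex given the x and the y."))
print(sloppy_parse("Create a vertex using the x and the y."))
```

Both calls yield the same skill and the same list of things, because “given” and “using” are equally unknown to the toy vocabulary and are skipped.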
At the lowest level, things look like this:
To add a number to another number:
Intel $8B85080000008B008B9D0C0000000103.
Note that in this case we have both the highest and lowest of languages -- English and machine code (in hexadecimal) -- in a single routine. The insight here is that (like a typical math book) a program should be written primarily in a natural language, with appropriate snippets in more convenient syntaxes as (and only as) required.
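For the curious, the hexadecimal string in that routine can be read with a few lines of Python. The sketch below is a deliberately tiny decoder, written for this post and assuming standard 32-bit x86 encoding; it handles only the two opcodes and two addressing modes that appear in this one string. The function and table names are made up for the example.

```python
MACHINE_CODE = "8B85080000008B008B9D0C0000000103"

REGISTERS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode(hex_string):
    """Decode a byte string made only of 'mov r32, m32' and 'add m32, r32'."""
    code = bytes.fromhex(hex_string)
    listing = []
    i = 0
    while i < len(code):
        opcode, modrm = code[i], code[i + 1]
        mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
        i += 2
        if mod == 0b00 and rm not in (4, 5):
            memory = f"[{REGISTERS[rm]}]"              # plain register indirect
        elif mod == 0b10 and rm != 4:
            disp = int.from_bytes(code[i:i + 4], "little", signed=True)
            i += 4
            memory = f"[{REGISTERS[rm]}{disp:+d}]"     # register plus 32-bit offset
        else:
            raise ValueError("addressing mode outside this sketch")
        if opcode == 0x8B:
            listing.append(f"mov {REGISTERS[reg]}, {memory}")
        elif opcode == 0x01:
            listing.append(f"add {memory}, {REGISTERS[reg]}")
        else:
            raise ValueError(f"opcode {opcode:#04x} outside this sketch")
    return listing


for line in decode(MACHINE_CODE):
    print(line)
```

This prints `mov eax, [ebp+8]`, `mov eax, [eax]`, `mov ebx, [ebp+12]`, and `add [ebx], eax`. One plausible reading, which is an inference from the bytes rather than something stated above, is that both parameters are passed by address on the stack: the first number's value is loaded and then added in place to the second.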
We hope someday soon to extend the technology to include Plain Spanish, and Plain French, and Plain German, etc.
Anyway, if you're interested, you can download the whole thing here: url redacted. It’s a small Windows program, less than a megabyte in size. No installation necessary; just unzip and execute. But it's a complete development environment, including a unique interface, a simplified file manager, an elegant text editor, a handy hexadecimal dumper, a native-code-generating compiler/linker, and even a WYSIWYG page layout facility (that we used to produce the documentation). If you start with the "instructions.pdf" in the “documentation” directory, before you go ten pages you won't just be writing "Hello, World!" to the screen: you’ll be re-compiling the whole shebang in itself (in less than three seconds on a bottom-of-the-line machine from Walmart).
Take a look. Then let us know what you think here on the forum.
Thanks,
Gerry Rzeppa
Grand Negus of the Osmosian Order of Plain English Programmers
Dan Rzeppa
Prime Assembler of the Osmosian Order of Plain English Programmers
Sorry, I can't allow a link to a .zip in this context. Interesting first post, but if you want to give people access to the project, then either post a link to a page of your own where they can read about the contents of the file and download it, or upload the source to GitHub or Bitbucket and promote it from there.
Markbnj
Programming mod