Coffee Space


Language Idea

I remember that years ago a fellow student spoke to me about writing a new programming language as a project. The truth is that, whilst I was a reasonably good engineer back then, there were a few problems:

  1. New idea - I could criticize languages easily (though not to any real depth), but I didn't really know what I was looking for that wasn't already out there.
  2. Ability - Whilst I could program reasonably well (enough to work on commercial projects), my ability has definitely improved since then.
  3. Effort - Writing a new language is a serious amount of effort. It's not something I feel I would really have the time for, even today.

The rest of this article is a bit of a ramble through some ideas I finally had a while back about something that could be theoretically interesting.

Ideas

The following is a collection of ideas:

  1. No NULL - The idea of a NULL value is just insane, and Rust has largely shown that a language can do without one.
  2. Simplified pointers - As Java has mostly done, simplify pointers so they simply don't have to be thought about.
  3. Everything string - Everything should 'appear' as a printable string, with more robust data types being used in the background. A little like how anything in Python can be printed.
  4. C-style - I think people mostly like the C-style way of doing things, I certainly do. It's familiar and it works.
  5. Un-crashable - Ideally it should aim to be un-crashable. Sure, bad behaviour might happen, errors might occur, but it should not just crash (unless the kernel does). It should always return some kind of error that can be handled. The assumption that crashing is better than the program itself being able to handle the error seems fundamentally wrong.
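
To make the "un-crashable" idea a little more concrete, here is a minimal sketch in Python (standing in for the imaginary language). The `safe` wrapper and the `ok`/`value`/`error` field names are made up for illustration; the point is only that a failure comes back as a value the caller can handle.

```python
def safe(function):
    """Wrap a function so failures come back as values instead of crashing."""

    def wrapper(*args, **kwargs):
        try:
            return {"ok": "true", "value": str(function(*args, **kwargs))}
        except Exception as error:  # deliberately broad for this sketch
            return {"ok": "false", "error": type(error).__name__}

    return wrapper


@safe
def divide(a, b):
    # Inputs are strings, as everything is in the proposed language.
    return int(a) // int(b)
```

Here `divide("10", "2")` gives `{"ok": "true", "value": "5"}`, whilst `divide("1", "0")` gives `{"ok": "false", "error": "ZeroDivisionError"}` rather than stopping the program.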

Implementation

I think everything should be an object, and a class, with everything having functions and being usable as a string.

You would declare an object like so:

class Object{
  /* You must always declare sensible defaults */
  private a = "default"; // Strings are always arrays
  private b = "20"; // Numbers are always strings

  /* Constructor for setting b (note, a constructor works on a clone) */
  public Object(c){
    b = c;
  }

  public doNothing(){}

  /* Get method - note the lack of need to declare a return value */
  public getB(){ return b; }
}

Constructors

The class Object itself is already an object and could be used as is, but the capital 'O' would indicate you are supposed to treat it like a class. Both uses look like this:

print(Object.getB());
obj = Object("10");
print(obj.getB());

Firstly it would print {"20"}. We then clone Object, run the constructor on the clone and store the result in obj, so the second print gives {"10"}. Note that the data is returned as a JSON representation, the default format for all objects [1]. All JSON in this language would always be a string; introducing non-string data was always a mistake.
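
The clone-then-construct behaviour can be roughly imitated in Python. This is only a sketch under my own assumptions: `Proto`, `show` and `get_b` are hypothetical names, and the `{"..."}` output format is simply copied from the examples above.

```python
import copy
import json


class Proto:
    """Hypothetical stand-in for the article's `Object` prototype."""

    def __init__(self):
        # Sensible defaults, stored as strings as proposed.
        self.a = "default"
        self.b = "20"

    def __call__(self, c=None):
        # "Calling" the prototype clones it, then runs the constructor body
        # on the clone, leaving the original untouched.
        clone = copy.deepcopy(self)
        if c is not None:
            clone.b = str(c)
        return clone

    def get_b(self):
        return self.b


def show(value):
    # Assumed print format: braces around one JSON string.
    return "{" + json.dumps(str(value)) + "}"


Object = Proto()      # the prototype is itself a usable object
obj = Object("10")    # clone, then run the constructor
```

With this, `show(Object.get_b())` is `{"20"}` and `show(obj.get_b())` is `{"10"}`, and the prototype keeps its default after the clone is made.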

A blank constructor is always a default clone of an object. Therefore:

obj = Object();
print(obj == Object()); // True

Implementation note: Variables don't always have the JSON representation available until it is requested by print or some other action that requires it. Internally it would of course prefer to do math on integers.
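
A small Python sketch of what the implementation note might mean in practice: keep a native integer for the math and only build the string form when somebody asks for it. `LazyValue` and its methods are invented for this illustration.

```python
import json


class LazyValue:
    """Hypothetical value that keeps a native int and builds its string form on demand."""

    def __init__(self, text):
        self._native = int(text)   # math happens on the native integer
        self._json = None          # string form is not built until needed

    def add(self, amount):
        self._native += amount
        self._json = None          # any change invalidates the cached string
        return self                # return self so calls can be chained

    def to_json(self):
        if self._json is None:
            self._json = json.dumps(str(self._native))
        return self._json


v = LazyValue("20").add(5).add(17)
```

Nothing is encoded during the two additions; the first call to `v.to_json()` builds and caches `"42"` (quotes included, since it is a JSON string).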

Chaining

Unless a return statement is defined, by default functions return a reference to the object. For example, these are the same:

print(Object.getB());
print(Object.doNothing().getB());
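
The same default can be approximated in Python with a decorator. This is a sketch, not how the language would have to do it: `chainable` and `Thing` are hypothetical, and Python cannot distinguish a missing return statement from an explicit `return None`, so both are treated as "return the object".

```python
import functools


def chainable(method):
    """Return the object itself whenever the wrapped method returns nothing."""

    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        result = method(self, *args, **kwargs)
        # No return value (or an explicit None) means "hand back the object".
        return self if result is None else result

    return wrapper


class Thing:
    def __init__(self):
        self.b = "20"

    @chainable
    def do_nothing(self):
        pass

    @chainable
    def get_b(self):
        return self.b


thing = Thing()
```

Here `thing.do_nothing()` is `thing` itself, so `thing.do_nothing().get_b()` and `thing.get_b()` both give `"20"`.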

Threading

All objects are atomic and locks are acquired to get and set values, including all deep objects in the structure. Each object is locked by both its own lock and the locks of all its parents.

If more than one thread is editing the same object, it would be suggested to make a clone of it using the b = a() style. You could then edit the clone and replace the original (but a race condition would still need to be handled).
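
One way the leftover race condition could be handled is a version check on replace, retrying when somebody else got there first. The Python below is only a sketch of that approach; `Shared`, `snapshot` and `replace` are names I made up, and a version counter is my assumption rather than part of the idea above.

```python
import copy
import threading


class Shared:
    """Hypothetical holder for an object that several threads want to edit."""

    def __init__(self, value):
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def snapshot(self):
        # Clone under the lock so the copy is internally consistent.
        with self._lock:
            return copy.deepcopy(self._value), self._version

    def replace(self, new_value, seen_version):
        # Only swap in the edited clone if nobody else replaced it meanwhile.
        with self._lock:
            if self._version != seen_version:
                return False
            self._value = new_value
            self._version += 1
            return True


def increment(shared):
    while True:  # retry until our edit is based on the latest version
        value, version = shared.snapshot()
        value["count"] = str(int(value["count"]) + 1)
        if shared.replace(value, version):
            return


shared = Shared({"count": "0"})
threads = [threading.Thread(target=increment, args=(shared,)) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Every successful replace is based on the latest version, so after the eight threads finish the count is `"8"` no matter how they interleave.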

Memory

If you access data that doesn't exist, it doesn't crash, it just returns a null placeholder. For example:

array = [ "1", "2" ];
print(array["0"]); // {"1"}
print(array["1"]); // {"2"}
print(array["2"]); // {}

null is always {}, and is also a valid object. As this type is blank, doesn't store data and has no methods, no memory is required to store it.
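
As a rough Python approximation, the placeholder could be a single shared object that every failed lookup returns. `Null` and `SafeArray` are hypothetical names, and a singleton only approximates "no memory required": it costs one object in total rather than nothing at all.

```python
class Null:
    """Hypothetical singleton for the `{}` placeholder."""

    _instance = None

    def __new__(cls):
        # Every "null" is the same object, so extra nulls cost no extra memory.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __repr__(self):
        return "{}"


class SafeArray:
    def __init__(self, *items):
        self._items = [str(item) for item in items]

    def __getitem__(self, index):
        # Indexes are strings in the proposed language; anything that is not
        # a valid position yields the placeholder instead of raising.
        try:
            position = int(index)
        except (TypeError, ValueError):
            return Null()
        if 0 <= position < len(self._items):
            return self._items[position]
        return Null()


array = SafeArray("1", "2")
```

`array["0"]` and `array["1"]` return `"1"` and `"2"`, whilst `array["2"]` (or any nonsense index) returns the same `{}` object.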

Data Types

For now I think there would be int, float and string. Strings in the language would be UTF-8 encoded as this now appears to be the standard. The user can only access strings, but can perform additional operations if the strings are of the right type (a lot like how Python handles such things).

int and float would offer as much precision as the platform allows for. This could be queried somehow, but ultimately we can only aim for so much platform independence. It could be nice to still be able to run this on a microcontroller, for example.
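
To illustrate strings that allow extra operations when they are "of the right type", here is a small Python sketch. `classify` and `add` are hypothetical helpers, Python's own `int()` and `float()` parsing stands in for whatever rules the language would really use, and falling back to concatenation for non-numeric strings is my assumption.

```python
def classify(text):
    """Guess the background type of a string: "int", "float" or "string"."""
    for name, parser in (("int", int), ("float", float)):
        try:
            parser(text)
            return name
        except ValueError:
            continue
    return "string"


def add(left, right):
    """Add two strings numerically when both look numeric, else concatenate."""
    kinds = {classify(left), classify(right)}
    if kinds == {"int"}:
        return str(int(left) + int(right))
    if kinds <= {"int", "float"}:
        return str(float(left) + float(right))
    return left + right
```

So `add("20", "22")` gives `"42"`, `add("1.5", "2")` gives `"3.5"`, and `add("a", "1")` gives `"a1"`; the user only ever sees strings.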

Functions vs Variables

In this language there is no difference between the two, so functions and variables must not share names.

Variables are of course scoped and there is still the idea of public, private and static types.

There would be some reserved keywords as you would have in other languages, such as:

// if and else are reserved
if(a == b){
  /* Something */
}else{
  /* Something else */
}
/* for is reserved */
for(i = 0; i < 10; i++){
  /* Loop something */
}
x = 0;
/* while is reserved */
while(x++ < 10){
  /* Also loop something */
}
/* switch, case, break, continue, etc. are reserved */
switch(x){
  case "0" :
    print("wow");
    break;
  default :
    print("nice");
    break;
}

Reserved would mean that neither an object nor a variable could have these names. We would also likely want to reserve all keywords commonly used in other languages in case we want them in the future.

Note: When checking the equality of two objects, we just check that their JSON representations are equal (deeply).
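
In Python terms, the equality check in the note could look something like the sketch below. Sorting the keys is my own assumption, so that the order in which fields were declared does not change the result; `to_json` and `equal` are hypothetical names.

```python
import json


def to_json(value):
    # sort_keys gives a canonical form, so field order does not affect equality.
    return json.dumps(value, sort_keys=True)


def equal(left, right):
    """Deep equality by comparing JSON representations."""
    return to_json(left) == to_json(right)


first = {"a": "default", "b": {"c": "20"}}
second = {"b": {"c": "20"}, "a": "default"}
third = {"a": "default", "b": {"c": "21"}}
```

`equal(first, second)` is true even though the fields are in a different order, and `equal(first, third)` is false because a nested value differs.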

Compiler

I would imagine this would be a largely reduced front-end for C++, where you could also include a C++ library using some intermediate wrapper that returns the correct types. This would likely be the fastest way to get something off the ground.

It wouldn't be particularly well optimised, especially in the beginning, as intelligent behaviour such as storing the data in a native format would be a challenge.

Objects would need to track a few things, and may initially look like the following:

class Object{
  public type(){ return "string"; }
  public data(){ return "{}"; } // Note this would be built dynamically
}

The data is the internal JSON representation and the type tells it how to manipulate the data. When it is more advanced it may also want to track things like when the JSON representation was last built, etc. Type could also be a parent class.

For the return of the data, all public and private variables for the object are recursively JSON-ified and returned. If there are no internal variables, there is no state, therefore we return null.
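
A rough Python sketch of that recursive step, with Python's `vars()` standing in for "all public and private variables". The `jsonify` and `data` helpers and the example classes are hypothetical, and sorting the keys is again my own choice.

```python
import json


def jsonify(obj):
    """Recursively turn an object's variables into JSON-ready data."""
    if isinstance(obj, (str, int, float)):
        return str(obj)                      # every leaf is a string
    if isinstance(obj, (list, tuple)):
        return [jsonify(item) for item in obj]
    if isinstance(obj, dict):
        return {str(key): jsonify(item) for key, item in obj.items()}
    # Anything else is treated as an object: walk its public and private fields.
    return {name: jsonify(item) for name, item in vars(obj).items()}


def data(obj):
    state = jsonify(obj)
    # No internal variables means no state, which is the `{}` placeholder.
    return json.dumps(state, sort_keys=True) if state else "{}"


class Inner:
    def __init__(self):
        self._secret = 7


class Outer:
    def __init__(self):
        self.name = "outer"
        self.inner = Inner()


class Empty:
    pass
```

`data(Outer())` gives `{"inner": {"_secret": "7"}, "name": "outer"}`, with the nested private number turned into a string, and `data(Empty())` gives `{}`.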

You would likely also want some kind of garbage collection system that gets notified when memory is no longer in use so that it can go along and free() it. I imagine you're going to be doing a lot of malloc() with this system.

Conclusion

Microsoft discontinued the Visual J++ project in 2004, having moved on to J# and C# as its answers to Java. I don't think I am stepping on any toes by calling this J++, a mixture of C++ and JSON.

Will I implement this? Probably not. I think it could work though, especially with a blazingly fast JSON encoder and decoder available.


  1. JSON appears to generally be one of the best structure description formats with 'just enough' syntax to be powerful. YAML for example is not powerful enough, and XML is too powerful.