[Cuis] Why or why not OMeta? (was Re: Brainstorming question: what non-trivial uses can you think of for an object-based parser? (strings not invited))

Thierry Goubier thierry.goubier at gmail.com
Sat May 23 07:15:47 CDT 2015


Hi Phil,

thanks for your input. I have a few things to add. In summary: OMeta's 
concise examples are just grammar productions as they could appear 
elsewhere (bison/SmaCC)... Certainly beats hand-written parsing :)

Another part of OMeta's use is as a kind of code rewriter / generator, 
and I wonder about the usefulness of yet another layer of rewriting (as 
in aspects / template-based metaprogramming)...

On 22/05/2015 21:49, Phil (list) wrote:
> Hi Thierry,
>
> On Fri, 2015-05-22 at 09:56 +0200, Thierry Goubier wrote:
>> Hi all,
>>
>>
>> first post here about Cuis, and this is a question I am interested
>> in... I do believe the viewpoint institutes documents have a few
>> answers about that (parsers for network protocols, etc...). But
>> still...
>>
>
> I could see network protocols as another time-series application.

I'm not in the know about that, but I would guess from past discussions 
that people dealing with network and communication protocols would like 
to see automata... and parsing is one way to build an automaton.

>> I'm in a strange position about OMeta which is I don't see the
>> benefits. I do have the same position about PetitParser, but with even
>> worse data points which is I know precisely the performance loss of
>> going the petit parser way.
>>
>
> Not strange at all given where it sounds like you're coming from re:
> performance being a key requirement.  I won't try to put any spin on it:
> everything I've seen indicates that OMeta is among the slowest parsers
> out there, but pretty quick given its approach.  Computing power being
> what it is today, for many applications the response is 'it's fast
> enough' or 'who cares?' (see the World Wide Web, client- and
> server-side, for a perfect example).  I would imagine that if you have
> heavy data processing
> workloads or have very specific response time requirements, then you do
> care and OMeta wouldn't work for the application.  However, as a
> language for DSLs, at most you're typically only going to see a small
> fraction of a second of overhead.  Another way to think of it: if speed
> OF the solution is the priority, don't use OMeta.  If speed TO the
> solution is the priority, that's what OMeta does well.  I'll get more
> specific below...



>
>>
>> I have been writing compiler front-ends for the past 7 years, first
>> with Flex / Bison and C, and then with Smalltalk / SmaCC (I maintain
>> SmaCC for Pharo). I see the work done by John Brant and Don Roberts
>> first hand (RB, SmaCC, generalised refactoring in SmaCC) and I know
>> that both OMeta and PetitParser use what is, for me, a very
>> limited form of parsing, with additionally a large performance
>> penalty. Moreover, grammars produced in the PetitParser case are as
>> long, if not longer than the equivalent SmaCC grammar.
>>
>
> I believe this is one of the areas where OMeta is quite strong: its
> grammars are short... very short... 'where did the grammar go?' short.
> Consider this example I posted earlier to parse Squeak array
> constructors.  Here is the Smalltalk version (i.e. what OMeta is
> actually doing behind the scenes):
>
> arrayConstr
> 	^ self ometaOr: {
> 		[true ifTrue: [
> 			self apply: #token withArgs: {'{'}.
> 			self apply: #expr.
> 			self many: [true ifTrue: [
> 				self apply: #token withArgs: {'.'}.
> 				self apply: #expr]].
> 			self ometaOr: {[self apply: #token withArgs: {'.'}]. [self apply: #empty]}.
> 			self apply: #token withArgs: {'}'}]].
> 		[true ifTrue: [
> 			self apply: #token withArgs: {'{'}.
> 			self apply: #token withArgs: {'}'}]]}
>
> and here's the OMeta version:
>
> arrayConstr =
> 	"{" expr ("." expr)* ("." | empty) "}"
> |	"{" "}"

But this is just what the SmaCC grammar for that would look like... This 
is what I added to the SmaCC Smalltalk grammar to support arrays:

	| "{" StatementList OptionalPeriod "}"
			{ RBArrayNode statements: '2' }
	| "{" "}"
			{ RBArrayNode new }

This is my biggest difficulty with OMeta: grammars are, well, grammars. 
No difference. PEGs are strictly less interesting than GLR (and probably 
even than LALR...).

Object orientation over grammars is interesting and may bring benefits 
(I had to work on an extension of C in the past), but why PEGs?
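
To make my point concrete, here is the textbook PEG issue (generic PEG 
behaviour, I have not checked this exact case against the OMeta 
implementation). With ordered choice, a rule written as

	rule =
		"a"
	|	"a" "b"

never recognises the input "ab" as a whole: the first alternative 
succeeds on "a", the choice commits, and the second alternative is not 
retried even when the rest of the parse later fails, so one has to 
reorder the alternatives by hand. A CFG alternation in bison/SmaCC 
considers both alternatives, and a GLR parser can even keep several 
parses alive when the grammar is genuinely ambiguous.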

> The only thing that's missing are any semantic predicates and actions so
> the ultimate size and readability will be more dictated by how much
> Smalltalk code it takes to actually do the work with what OMeta has
> parsed.

From my experience, this part vastly dwarfs any parsing-related 
complexity :( I'm doing things for R at the moment, and it is going a 
lot slower than I expected because what is behind the parser is very 
complex.

>> So what are the benefits of OMeta? Note that SmaCC would very easily
>> do parsing over any kind of objects, not only tokens.
>>
>
> I understand that OMeta isn't unique in being an object parser and I
> started this thread mainly because I'm wondering how much value people
> can see in parsing things other than text/binary streams.  i.e. is it a
> genuinely useful feature or a gimmick/freebie that won't see much use?

I do think that OMeta has some unique benefits... but which?

> As to the first part of your question, here goes:  The fundamental
> concept that really grabs me is the OMeta approach of being written in
> the host language and using source to source translation to target the
> host language while essentially hijacking the host language and
> environment to fade into the background of the host environment.  Want a
> new DSL?  Subclass OMeta2 and add methods with your rules... done.  Want
> a new dialect of said DSL?  Subclass your first DSL and tweak as needed.
> Want to write a program in your DSL?  Create a new class and set up the
> compiler for that class to use your parser as its 'Language'.  For
> example, I could create a subclass called Lisp and write every method in
> that class as either pure Lisp or as a hybrid of Lisp/Smalltalk/and any
> other DSLs I had created, provided I set up the parsing correctly.  I'm
> not aware of any other parser that does it quite so elegantly.

You're probably right on that, and it could bring interesting uses.
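
If I read your description right, wrapping your arrayConstr example as a 
small DSL would look roughly like this (just a sketch from your message, 
with made-up class and category names, not something I have tried 
against the OMeta2 package):

	OMeta2 subclass: #ArrayConstrParser
		instanceVariableNames: ''
		classVariableNames: ''
		poolDictionaries: ''
		category: 'MyDSL-Parsers'

then each rule is a method of ArrayConstrParser, written in OMeta syntax 
as in your

	arrayConstr =
		"{" expr ("." expr)* ("." | empty) "}"
	|	"{" "}"

and a dialect of the DSL is just a further subclass overriding the rules 
it changes.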

> Now here are the downsides:  Alex, the original author of OMeta, is a
> parser / languages guy.  This work was related to his employment at VPRI
> and subsequent PhD work.  He's since moved on to other things and
> there's still a lot missing from OMeta on Smalltalk in terms of tooling
> to actually realize the vision.  The lack of debugging support will
> drive you nuts until you get used to what it's telling you:  have a
> syntax error in your rules? '<-- parse error around here -->'... have
> fun! A semantic error in your parser? Get used to looking at your
> decompiled code (i.e. the actual Smalltalk it generates) when things go
> wrong to figure it out.  Have a logic/runtime error (i.e. your generated
> code is sending a message to nil)?  Ditto re: looking at the decompiled
> code when it crashes while running.  When everything is correct and
> working, OMeta is pure joy.  When it isn't, welcome back to 1980's style
> debugging.  Also, if you have an ambiguous grammar look elsewhere...
> OMeta won't work for you.  Finally, as I mentioned at the top, OMeta
> isn't going to set any new parser speed records.

I believe some of this is related to DSL support / modeling inside tools 
(debugging), and isn't really specific to OMeta (IMHO).

> Hope this helps,

Thanks,

Thierry



