Parser API (wip but working good).


sledorze
Hi!

A short post to say I've started a parser library on my GitHub account.
It's a single file at the moment:
https://github.com/sledorze/haxeExtensionBuilder/blob/master/src/com/mindrocks/text/Parser.hx

It can parse JSON with the parser below.

I started it when I went back to thinking about providing a correct anon macro (one supporting arbitrary code in values).

It's an evening project, so it does not yet handle left-recursive grammars or report positions with errors.
It does follow a clean combinator-based approach, however.

It would be awesome if it could generate Bison-like state-machine-based code via some code transformations, without changing the API.

Anyone interested can fork it or send requests.

Stephane



typedef JsEntry = { name : String, value : JsValue}
enum JsValue {
  JsObject(fields : Array<JsEntry>);
  JsArray(elements : Array<JsValue>);
  JsData(x : String);
}


class JsonParser {

  // Converts a parsed (name, value) pair into a JsEntry.
  static function makeField(t : Tuple2<String, JsValue>) return
    { name : t.a, value : t.b }

  // Keys and plain values are unquoted runs of these characters.
  static var identifierR = ~/[a-zA-Z0-9_-]+/;

  // Punctuation tokens, each allowing leading whitespace.
  static var leftAccP = withSpacing("{".identifier());
  static var rightAccP = withSpacing("}".identifier());
  static var leftBracketP = withSpacing("[".identifier());
  static var rightBracketP = withSpacing("]".identifier());
  static var sepP = withSpacing(":".identifier());
  static var commaP = withSpacing(",".identifier());

  static var spaceP = " ".identifier();
  static var tabP = "\t".identifier();
  static var retP = ("\r".identifier().or("\n".identifier()));

  // Any mix of spaces, tabs and line breaks (possibly none).
  static function spacingP () return
    [
      spaceP.oneMany(),
      tabP.oneMany(),
      retP.oneMany()
    ].ors().many()()

  // Skips leading whitespace, then runs p and keeps only p's result.
  static function withSpacing<T>(p : Void -> Parser<T>) return
    spacingP._and(p)

  static var identifierP =
    withSpacing(identifierR.regexParser());

  static var jsonDataP =
    identifierP.then(JsData).lazyF();

  // Judging from the result types, _and keeps the right-hand result and and_ the left-hand one.
  static var jsonArrayP =
    leftBracketP._and(jsonValueP.repsep(commaP)).and_(rightBracketP).then(JsArray).lazyF();

  // A value is an object, a plain datum or an array; lazyF defers construction of these mutually recursive rules.
  static var jsonValueP : Void -> Parser<JsValue> =
    [jsonParser, jsonDataP, jsonArrayP].ors().lazyF();

  static var jsonEntryP =
    identifierP.and_(sepP).and(jsonValueP).lazyF();

  static var jsonEntriesP =
    jsonEntryP.repsep(commaP).lazyF();

  // Entry point: a brace-delimited, comma-separated list of entries.
  public static var jsonParser =
    leftAccP._and(jsonEntriesP).and_(rightAccP).then(function (entries)
      return JsObject(entries.map(makeField).array())
    ).lazyF();
}
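
For anyone unfamiliar with the combinator style used above, here is a minimal, self-contained sketch of the idea in current Haxe. It is not the code from Parser.hx: the names ParseResult, Parser, CombinatorSketch, literal, seq, either and map are all hypothetical, chosen only to mirror roughly what identifier, and, or and then appear to do in the example.

```haxe
// Hypothetical sketch of parser combinators; not taken from Parser.hx.
enum ParseResult<T> {
  Success(value : T, rest : Int); // rest = index of the first unconsumed character
  Failure(at : Int);              // at = index where matching failed
}

// A parser is a function from (input, start position) to a result.
typedef Parser<T> = String -> Int -> ParseResult<T>;

class CombinatorSketch {
  // Matches the exact string s at the current position.
  public static function literal(s : String) : Parser<String> {
    return function(input : String, pos : Int) : ParseResult<String> {
      return input.substr(pos, s.length) == s ? Success(s, pos + s.length) : Failure(pos);
    };
  }

  // Runs p, then q on the remaining input, and pairs the two results.
  public static function seq<A, B>(p : Parser<A>, q : Parser<B>) : Parser<{ a : A, b : B }> {
    return function(input : String, pos : Int) : ParseResult<{ a : A, b : B }> {
      return switch (p(input, pos)) {
        case Failure(at): Failure(at);
        case Success(va, mid):
          switch (q(input, mid)) {
            case Failure(at): Failure(at);
            case Success(vb, rest): Success({ a : va, b : vb }, rest);
          }
      };
    };
  }

  // Tries p; if it fails, backtracks to the same position and tries q.
  public static function either<A>(p : Parser<A>, q : Parser<A>) : Parser<A> {
    return function(input : String, pos : Int) : ParseResult<A> {
      return switch (p(input, pos)) {
        case Failure(_): q(input, pos);
        case ok: ok;
      };
    };
  }

  // Transforms the value of a successful parse with f.
  public static function map<A, B>(p : Parser<A>, f : A -> B) : Parser<B> {
    return function(input : String, pos : Int) : ParseResult<B> {
      return switch (p(input, pos)) {
        case Success(v, rest): Success(f(v), rest);
        case Failure(at): Failure(at);
      };
    };
  }

  static function main() {
    var ab = seq(literal("a"), literal("b"));
    var ac = seq(literal("a"), literal("c"));
    var p = map(either(ab, ac), function(t) return t.a + t.b);
    trace(p("ac", 0)); // ab fails on "c", so either falls back to ac
  }
}
```

With `using CombinatorSketch;` these static functions could be chained as methods, which is presumably how calls like `identifierP.and_(sepP)` in the example read the way they do.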

Re: Parser API (wip but working good).

Justin Donaldson-3
There are a few more JSON parsers:
http://code.google.com/p/thx-core/source/browse/src/thx/json/JsonDecoder.hx
http://lib.haxe.org/p/hxJSON

I'm using the first one (Franco's).

Your approach looks different, so the others may be useful mainly for comparing decoding/encoding speed.

-Justin




Re: Parser API (wip but working good).

sledorze
:)
The important part is not the JSON parser (it's just an example, and mine is probably much slower, and maybe incomplete or wrong) but the general parser API it is built upon.

Stéphane


Re: Parser API (wip but working good).

Franco Ponticelli
Very interesting ...
@sledorze, does the class name Functionnal have the "n" doubled intentionally?

Franco 


Re: Parser API (wip but working good).

sledorze
no.. :p

Warning: this repo has not had a lot of love..
It's the dump where I throw the stuff I do in the evenings..

I decided it would be nice to have it all in one place and be able to improve it in my free time, as needed.

Thanks for reporting this bug! ;)

Re: Parser API (wip but working good).

sledorze
In reply to this post by Franco Ponticelli
What would be interesting (after handling left recursion and better error reporting)
would be to generate an AST of the grammar, fuse what can be fused, and generate optimized code in place via a compiler macro.

Honestly..  :)
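
To make the "fuse what can be fused" idea concrete, here is a rough sketch under stated assumptions: the grammar is represented as a small enum, and the only fusion performed is merging two adjacent literals in a sequence. Grammar, Lit, Seq, Alt and FuseSketch are hypothetical names, not part of the library, and a real pass would likely do much more.

```haxe
// Hypothetical grammar AST; not part of Parser.hx.
enum Grammar {
  Lit(s : String);               // match an exact string
  Seq(a : Grammar, b : Grammar); // a followed by b
  Alt(a : Grammar, b : Grammar); // a, or else b
}

class FuseSketch {
  // Rewrites the tree bottom-up, merging adjacent literals in a sequence.
  public static function fuse(g : Grammar) : Grammar {
    return switch (g) {
      case Lit(_): g;
      case Alt(a, b): Alt(fuse(a), fuse(b));
      case Seq(a, b):
        switch [fuse(a), fuse(b)] {
          case [Lit(x), Lit(y)]: Lit(x + y); // two literals become one
          case [fa, fb]: Seq(fa, fb);
        }
    };
  }

  static function main() {
    // Three literals in sequence collapse to the single literal "true".
    var g = Seq(Lit("tr"), Seq(Lit("u"), Lit("e")));
    trace(fuse(g));
  }
}
```

A macro could then walk the fused tree and emit one string comparison instead of three chained parser calls.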

Re: Parser API (wip but working good).

sledorze
I've decided to go one step further.

I've added error reporting and prevented the recursive generation of parser instances when executing some rules.

More importantly, I've also integrated a memo function that lifts a parser to a packrat parser, enabling:

Direct, indirect and mutual left recursion.
Backtracking and unlimited lookahead.
Generally linear parsing time.

Paper here: http://www.vpri.org/pdf/tr2007002_packrat.pdf
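
As a rough illustration of the memoization part only, here is a sketch that caches a parser's result per start position. It does not implement the left-recursion handling described in the paper, and it is not the library's memo function: ParseResult, Parser, MemoSketch and memo are hypothetical names used for this example.

```haxe
// Hypothetical sketch of per-position memoization; not taken from Parser.hx.
enum ParseResult<T> {
  Success(value : T, rest : Int); // rest = index of the first unconsumed character
  Failure(at : Int);              // at = index where matching failed
}

typedef Parser<T> = String -> Int -> ParseResult<T>;

class MemoSketch {
  // Wraps p so that each start position is parsed at most once per input string.
  public static function memo<T>(p : Parser<T>) : Parser<T> {
    var cache = new Map<Int, ParseResult<T>>();
    var cachedInput : String = null;
    return function(input : String, pos : Int) : ParseResult<T> {
      if (input != cachedInput) {
        // New input: results cached for the previous one no longer apply.
        cache = new Map<Int, ParseResult<T>>();
        cachedInput = input;
      }
      var hit = cache.get(pos);
      if (hit != null) return hit;
      var result = p(input, pos);
      cache.set(pos, result);
      return result;
    };
  }

  static function main() {
    var calls = 0;
    // A single-digit parser that counts how often it actually runs.
    var digit : Parser<String> = function(input : String, pos : Int) : ParseResult<String> {
      calls++;
      var c = input.charAt(pos);
      return (c.length == 1 && "0123456789".indexOf(c) >= 0) ? Success(c, pos + 1) : Failure(pos);
    };
    var fast = memo(digit);
    fast("7x", 0);
    fast("7x", 0); // answered from the cache
    trace(calls);  // calls is 1 here: digit ran only once
  }
}
```

Because every (rule, position) pair is evaluated at most once, backtracking no longer repeats work, which is where the generally linear parsing time comes from.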

The API itself has not changed for the user.
After more testing, this would give us a general cross-platform parsing API.
(FYI: Memo removes the need for lexers)

I will push it within the next week (depending on free time) after giving it some more love and some clean-up.

Stephane

Re: Parser API (wip but working good).

laurence taylor
Darn, you beat me to it. Looking forward to seeing it!
