[Cs-dev] Parser3

Date: 2014-08-21 02:42
From: Steven Yi
Subject: [Cs-dev] Parser3

Hi All,

Just FYI, I have started a new project for exploring grammar changes
for Csound at:

https://github.com/kunstmusik/parser3

After analyzing the current parser, I decided it was too hard to
experiment with language changes quickly in it. Instead, I started a
fresh grammar specifically for experimenting with the language,
without having to worry about the actual compilation, line numbering,
error handling, and all the other things that the real parser does in
Csound.

The parser so far is just a start.  In general my working method right
now is to add one new line of ORC code to csound.orc, have the parser
fail, then go and implement the rules in the grammar and lexer. (Sort
of following the general Test-Driven Development practice of writing a
failing case first, then implementing to fix it.)

To note: the idea with Parser3 is to move a lot of the semantic-aware
things out of the current parser.  This includes removing rules like
rident, bexpr, opcode, and opcode0, as these things shouldn't be
necessary to distinguish at parse time.  (Instead, they should be
processed during the semantic analysis phase.)  Also, some things may
end up being moved into the lexer.  For example, I've been looking at
the Java Language Specification for some inspiration
(http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.12)
and thought to make new lexical tokens for OPERATOR and
ASSIGNMENT_OPERATOR.
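
A minimal sketch of what classifying operators in the lexer could look
like, written in Python rather than flex for brevity. The operator sets,
the PUNCT and NUMBER token names, and the tokenize function are all
assumptions made for illustration; they are not taken from the Parser3
repository.

```python
import re

# Hypothetical token classes, loosely modeled on JLS section 3.12.
# Which operators belong in each class for Csound is an assumption.
TOKEN_SPEC = [
    ("ASSIGNMENT_OPERATOR", r"\+=|-=|\*=|/=|=(?!=)"),
    ("OPERATOR", r"==|!=|<=|>=|&&|\|\||[-+*/%<>!?:^]"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("T_IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("PUNCT", r"[(),\[\]]"),
    ("SKIP", r"[ \t]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(line):
    """Split one line of source into (token_class, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = MASTER.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at {pos}")
        if m.lastgroup != "SKIP":  # whitespace is insignificant
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this sketch, tokenize("kval += 2 * ifreq") classifies "+=" as an
ASSIGNMENT_OPERATOR and "*" as an OPERATOR, so the grammar only has to
mention the two token classes instead of every operator.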

Hopefully the end result will be a smaller, simpler grammar. When we
first started putting together the current bison grammar for Csound 5,
we ran into some issues that made Csound ORC difficult to parse.
However, I'm optimistic at this point that we should be able to solve
these issues more directly now that we have the type system in Csound
6 and a more certain semantic analysis phase.

I'll be focusing my time on this as my primary Csound dev work for
now. Any comments and feedback on the grammar as it develops are
appreciated.

Thanks!
steven

------------------------------------------------------------------------------
Slashdot TV.  
Video for Nerds.  Stuff that matters.
http://tv.slashdot.org/
_______________________________________________
Csound-devel mailing list
Csound-devel@lists.sourceforge.net

Date: 2014-08-21 22:12
From: joachim heintz
Subject: Re: [Cs-dev] Parser3

sounds fantastic, steven. please give a hint here if there is anything
to test or to give some feedback on. all best -
	joachim


Date: 2014-08-22 17:43
From: Steven Yi
Subject: Re: [Cs-dev] Parser3

Thanks Joachim! I think there won't be much to test until I start
pushing this stuff into Csound itself.

Actually, I was able to move pretty quickly with the bare grammar.  I
think I'm actually at a point where I could start implementing this in
Csound.

To note, the current grammar is at:

https://github.com/kunstmusik/parser3/blob/master/src/csound_orc.y

Right now, that grammar has one shift/reduce conflict, and it's not
critical as far as I can tell (taking the shift seems to do the job
correctly). The really tricky part is that supporting opcode call
statements overlaps somewhat with assignment/function call
statements.  With the above grammar, some situations will be matched
a little awkwardly, but those can be dealt with by the semantic
analyzer.

For example, things like these:

dosomething (4)
dosomething (4), 5, 6

all get matched as opcode calls.  Most of them will be caught by one rule:

        | out_arg_list expr_list NEWLINE

So parsing the above we'd get an out_arg_list with one element called
dosomething, and then an expr_list with the parts starting with (4).
That's odd in terms of language, but grammatically it's alright in
this scenario, as we can add rules at semantic analysis time to
disambiguate what is going on.  (For example, we can check whether the
out_arg_list has only one item; if so, it becomes a candidate for an
opcode call if the word matches an opcode name, etc.)
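
A minimal sketch of that semantic-analysis check, in Python rather than
the C that Csound is written in. The node classes, the KNOWN_OPCODES
table, and the as_opcode_call function are hypothetical names introduced
for illustration; they do not come from Parser3 or Csound.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical opcode table; a real analyzer would look names up in
# Csound's opcode registry instead of a hard-coded set.
KNOWN_OPCODES = {"dosomething", "out", "outs"}


@dataclass
class TwoPartStatement:
    """What the `out_arg_list expr_list NEWLINE` rule would hand over."""
    out_args: List[str]  # identifiers matched by out_arg_list
    exprs: List[str]     # source text of each expression in expr_list


@dataclass
class OpcodeCall:
    name: str
    inputs: List[str]


def as_opcode_call(stmt: TwoPartStatement) -> Optional[OpcodeCall]:
    """Reinterpret a lone 'output' that names an opcode as a call with no outputs."""
    if len(stmt.out_args) == 1 and stmt.out_args[0] in KNOWN_OPCODES:
        return OpcodeCall(name=stmt.out_args[0], inputs=list(stmt.exprs))
    return None  # not a candidate; other semantic rules would take over
```

For "dosomething (4), 5, 6" the parser would hand over out_args of
["dosomething"] and exprs of ["(4)", "5", "6"], and the check turns that
into a call to dosomething with three inputs.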

I think if the semantic rules are set up correctly, this should
properly handle the issues with the space between an opcode name and
().  (Note: if you go through the assignment statement path, the
ambiguity of () is gone, as we match through the function_call rule.)

At this point, I'm feeling somewhat good about the state of this
grammar.  I'll plan to start a new branch and start experimenting with
incorporating the changes into the real Csound code.


Date: 2014-08-22 18:07
From: Felipe Sateler
Subject: Re: [Cs-dev] Parser3

I don't know anything about this, but... how is a return argument list
followed by an expression list an opcode call? I'm confused. I'm not
sure what sort of expressions that is meant to capture.

PS: there are some unused lexer and parser tokens like ZERODBFS and STRING

-- 

Saludos,
Felipe Sateler

Date: 2014-08-22 18:50
From: Steven Yi
Subject: Re: [Cs-dev] Parser3

Yes, it's confusing. :)  It has to do with whitespace being
insignificant. So if you have:

a = functionName (2)

That gets parsed as a statement, with T_IDENT '=' function_call.
That's what you'd expect.  However, what happens with the opcall rule
is that we have an ambiguity:

functionName (2)

If we try to use function_call as a base statement, we get into a
parsing problem.  The parser sees the above as:

T_IDENT '(' INTEGER_TOKEN ')'

Now, it can actually read that in two ways: as a function_call, or as
an out_arg followed by an expression (the expression being
'(' INTEGER_TOKEN ')').

The use of "out_arg_list expr_list" actually catches all scenarios for
opcode calls where there are two parts, e.g.:

asig in
asig, asig1  in
out asig
out asig, asig2, asig3

However, it will also catch:

asig, asig  in, in, in
out, asig  asig2, asig3

and think it's valid.  That's alright though, as we will have rules in
the semantic analyzer to rule out the invalid cases. Also, something
like:

out asig (3)

will actually be caught by the third rule:

out_arg_list  T_IDENT expr_list NEWLINE

because the (3) will be identified as an expression.

Note, Csound's opcode syntax and use of expressions is by nature
ambiguous.  The current parser addresses that by having some
semantic information available early on, before parsing.  This just
flips that around: accept these forms at parse time, and sort them
out later at semantic analysis time.
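
The two-part examples above can be sorted out after parsing with
something like the following sketch. It is Python rather than C, it only
handles bare identifiers, and the OPCODES set and the find_opcode
function are assumptions for illustration, not the actual Csound
semantic analyzer.

```python
# Hypothetical opcode names; a real analyzer would query Csound's opcode table.
OPCODES = {"in", "out"}


def find_opcode(out_args, exprs):
    """Return (opcode, outputs, inputs) for a two-part statement, or None if invalid.

    out_args and exprs are lists of identifier strings, as matched by
    out_arg_list and expr_list respectively.
    """
    # Form 1: `out asig, asig2` -> the single "output" is really the opcode.
    if len(out_args) == 1 and out_args[0] in OPCODES:
        return out_args[0], [], list(exprs)
    # Form 2: `asig, asig1 in` -> the single "expression" is really the opcode.
    if len(exprs) == 1 and exprs[0] in OPCODES:
        return exprs[0], list(out_args), []
    # Anything else, such as `asig, asig  in, in, in`, parsed fine but is rejected here.
    return None
```

The four valid examples come back with an opcode name, while
"asig, asig  in, in, in" and "out, asig  asig2, asig3" both return None,
which is where an error would be reported.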

Date: 2014-08-22 21:22
From: Felipe Sateler
Subject: Re: [Cs-dev] Parser3

On Fri, Aug 22, 2014 at 1:50 PM, Steven Yi  wrote:
> The use of "out_arg_list expr_list" actually catches all scenarios for
> opcode calls where there is two parts, i.e.:
>
> asig in
> asig, asig1  in
> out asig
> out asig, asig2, asig3
>
> However, it will also catch:
>
> asig, asig  in, in, in
> out, asig  asig2, asig3

Shouldn't this rule be instead:

| T_IDENT expr_list NEWLINE
| out_arg_list T_IDENT NEWLINE

?

That should avoid catching the last 2 examples.

Again, I don't know much about parsers. But it seems to me that the
grammar should describe what is expected: all those examples actually
want a T_IDENT, and now you are "cheating" by pretending the T_IDENT
is an expr (which you can get away with because expr includes
T_IDENT).

But you probably tried this and it didn't work. How so?

> Note, Csound's opcode syntax and use of expressions is by nature
> ambiguous.  Right now the current parser addresses that by having some
> semantic information early on before parsing.  This just flips that to
> say, this is acceptable at parse time, but we'll sort it out later at
> semantic analysis time.

Hmm OK, I think I now see the ambiguity. And this of course makes
writing a parser more difficult. These parser things go way over my
head :p


-- 

Saludos,
Felipe Sateler

Date: 2014-08-22 22:06
From: Steven Yi
Subject: Re: [Cs-dev] Parser3

Yes, it is ambiguous.  I had already tried:

T_IDENT expr_list NEWLINE
| out_arg_list T_IDENT NEWLINE

and what happens is the parser doesn't know when it has an initial
T_IDENT whether to reduce it as an out_arg_list or to shift and try to
match the next value as an expr_list.  This is because T_IDENT is a
possible token that would match both expr and out_arg, and both of
those are used for expr_list and out_arg_list respectively.

So yes, this is a workaround and not ideal, but there's not much to be
done considering Csound's language.
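
To make the overlap concrete, here is a small Python sketch that checks
which of the two alternatives can derive a given token sequence. It
assumes, for simplicity, that every expr and out_arg is a single
T_IDENT; the function names are made up for illustration, and this is
not how bison itself reports conflicts.

```python
def is_ident_list(tokens):
    """True if tokens look like T_IDENT (',' T_IDENT)*."""
    if len(tokens) % 2 == 0:
        return False
    for i, tok in enumerate(tokens):
        expected = "T_IDENT" if i % 2 == 0 else ","
        if tok != expected:
            return False
    return True


def matching_rules(tokens):
    """Return the alternatives that can derive the token sequence."""
    if not tokens or tokens[-1] != "NEWLINE":
        return []
    body = tokens[:-1]
    matches = []
    # Alternative 1: T_IDENT expr_list NEWLINE
    if len(body) >= 2 and body[0] == "T_IDENT" and is_ident_list(body[1:]):
        matches.append("T_IDENT expr_list")
    # Alternative 2: out_arg_list T_IDENT NEWLINE
    if len(body) >= 2 and body[-1] == "T_IDENT" and is_ident_list(body[:-1]):
        matches.append("out_arg_list T_IDENT")
    return matches
```

A line such as "out asig" lexes to T_IDENT T_IDENT NEWLINE, and
matching_rules reports both alternatives for it. After the first
T_IDENT, with another T_IDENT as lookahead, the parser has no way to
tell which one is intended.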

Date: 2014-08-22 22:33
From: Felipe Sateler
Subject: Re: [Cs-dev] Parser3

Ah, that makes sense.

Thanks for explaining!

-- 

Saludos,
Felipe Sateler
