[Cs-dev] Parser3


Date	2014-08-21 02:42
From	Steven Yi
Subject	[Cs-dev] Parser3
Hi All,

Just FYI, I have started a new project for exploring grammar changes for Csound at:

https://github.com/kunstmusik/parser3

I had done some analysis of the current parser and decided it was too hard for me to experiment with language changes quickly with it. Instead, I started a fresh grammar specifically for experimenting with the language, without having to worry about the actual compilation, line numbering, error handling, and all the other things that the real parser does in Csound.

The parser so far is just a start. In general, my working method right now is to add one new line of ORC code to csound.orc, have the parser fail, then go and implement the rules in the grammar and lexer. (This roughly follows the Test-Driven Development practice of writing a failing case first, then implementing until it passes.)

To note: the idea with Parser3 is to move a lot of the semantics-aware things out of the current parser. This includes removing rules like rident, bexpr, opcode, and opcode0, as these things shouldn't need to be distinguished at parse time; instead, they should be processed during the semantic analysis phase. Also, some things may end up being moved into the lexer. For example, I've been looking at the Java Language Specification for inspiration (http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.12) and thought to make new lexical tokens for OPERATOR and ASSIGNMENT_OPERATOR.

Hopefully the end result will be a smaller, simpler grammar. When we first started putting together the current bison grammar for Csound 5, we ran into some issues that made Csound ORC difficult to parse. However, I'm optimistic at this point that we should be able to solve these issues more directly now that we have the type system in Csound 6 and a more certain semantic analysis phase.

I'll be focusing my time on this as my primary Csound dev work for now. Any comments and feedback on the grammar as it develops are appreciated.

Thanks!
steven

------------------------------------------------------------------------------
Slashdot TV. Video for Nerds. Stuff that matters.
http://tv.slashdot.org/
_______________________________________________
Csound-devel mailing list
Csound-devel@lists.sourceforge.net
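
A minimal sketch of the lexer idea mentioned in the message above: instead of one token per operator, the lexer emits just two token kinds, OPERATOR and ASSIGNMENT_OPERATOR. The operator set, the regular expression, and the `tokenize` helper are assumptions made for illustration and are not taken from the parser3 repository; Python is used only because it is easy to run.

```python
import re

# Hypothetical set of assignment operators; the real parser3 lexer may differ.
ASSIGNMENT_OPERATORS = {"=", "+=", "-=", "*=", "/="}

# Multi-character operators are listed before single-character ones so that
# "+=" is not split into "+" and "=".
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d+)?)
  | (?P<T_IDENT>[A-Za-z_]\w*)
  | (?P<OP>\+=|-=|\*=|/=|==|!=|<=|>=|&&|\|\||[-+*/%^<>=])
  | (?P<PUNCT>[(),])
  | (?P<WS>[ \t]+)
""", re.VERBOSE)


def tokenize(line):
    """Split one ORC-like line into (kind, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {line[pos]!r} at column {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "WS":
            continue
        if kind == "OP":
            # Collapse every operator into one of two lexical classes.
            kind = "ASSIGNMENT_OPERATOR" if text in ASSIGNMENT_OPERATORS else "OPERATOR"
        tokens.append((kind, text))
    return tokens
```

With this split, the grammar would only need to mention the two token kinds, and the decision about which concrete operator was used could be left to a later phase.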

Date	2014-08-21 22:12
From	joachim heintz
Subject	Re: [Cs-dev] Parser3
sounds fantastic, steven. please give a hint here if there is anything to test or to give some feedback on.

all best -
joachim

Date	2014-08-22 17:43
From	Steven Yi
Subject	Re: [Cs-dev] Parser3
Thanks Joachim! I think there won't be much to test until I start pushing this stuff into Csound itself.

Actually, I was able to move pretty quickly with the bare grammar. I think I'm at a point where I could start implementing this in Csound.

To note, the current grammar is at:

https://github.com/kunstmusik/parser3/blob/master/src/csound_orc.y

Right now, that grammar has one shift/reduce conflict, and it's not critical as far as I can tell (taking the shift seems to do the job correctly). The really tricky part is that supporting opcode call statements overlaps a bit with assignment/function call statements. With the above grammar, I've figured that it will match some situations a little awkwardly, but on the other hand, those can be dealt with by the semantic analyzer.

For example, things like these:

    dosomething (4)
    dosomething (4), 5, 6

all get matched as opcode calls. Most of them will be caught by one rule:

    | out_arg_list expr_list NEWLINE

So parsing the above we'd get an out_arg_list with one element called dosomething, and then an expr_list with the parts starting with (4). That's odd in terms of language, but grammatically it's alright in this scenario, as we can add rules at semantic analysis time to disambiguate what is going on (e.g. we can check whether the out_arg_list has only one item; if so, it becomes a candidate for an opcode call if the word matches an opcode name, etc.).

I think if the semantic rules are set up correctly, this should properly handle the issues with the space between an opcode name and (). (Note: if you go through the assignment statement path, the ambiguity of () is gone, as we match through the function_call rule.)

At this point, I'm feeling somewhat good about the state of this grammar. I'll plan to start a new branch and start experimenting with incorporating the changes into the real Csound code.
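
The semantic check described in the message above (a single-item out_arg_list whose word matches an opcode name becomes an opcode call) could look roughly like the following sketch. The `OpcodeCall` type, the `resolve_statement` function, and the opcode table are hypothetical names introduced here for illustration; the real analyzer in Csound would work on its own tree structures and opcode list.

```python
from dataclasses import dataclass

# Hypothetical opcode table; a real analyzer would consult Csound's opcode list.
KNOWN_OPCODES = {"dosomething", "out", "in"}


@dataclass
class OpcodeCall:
    name: str
    inputs: list


def resolve_statement(out_args, exprs, opcodes=KNOWN_OPCODES):
    """Reinterpret a parse of `out_arg_list expr_list NEWLINE`.

    If the out_arg_list holds exactly one word and that word names an opcode,
    treat the statement as a call to that opcode with the expr_list as inputs.
    Return None when the statement is not such a candidate.
    """
    if len(out_args) == 1 and out_args[0] in opcodes:
        return OpcodeCall(name=out_args[0], inputs=list(exprs))
    return None
```

For `dosomething (4), 5, 6`, the parser would hand over `["dosomething"]` and `["(4)", "5", "6"]`, and the check turns that into a call to `dosomething` with three inputs.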

Date	2014-08-22 18:07
From	Felipe Sateler
Subject	Re: [Cs-dev] Parser3
I don't know anything about this, but... how is a return argument list followed by an expression list an opcode call? I'm confused. I'm not sure what sort of expressions that is meant to capture.

PS: there are some unused lexer and parser tokens like ZERODBFS and STRING.

--
Regards,
Felipe Sateler

Date	2014-08-22 18:50
From	Steven Yi
Subject	Re: [Cs-dev] Parser3
Yes, it's confusing. :) It has to do with whitespace being insignificant. So if you have:

    a = functionName (2)

That gets parsed as a statement, with T_IDENT '=' function_call. That's what you'd expect. However, what happens with the opcall rule is that we have an ambiguity:

    functionName (2)

If we try to use function_call as a base statement, we get into a parsing problem. The parser sees the above as:

    T_IDENT '(' INTEGER_TOKEN ')'

Now, it can read that in two ways: one as a function_call, the other as an out_arg followed by an expression (the expression being '(' INTEGER_TOKEN ')').

The use of "out_arg_list expr_list" actually catches all scenarios for opcode calls where there are two parts, e.g.:

    asig in
    asig, asig1 in
    out asig
    out asig, asig2, asig3

However, it will also catch:

    asig, asig in, in, in
    out, asig asig2, asig3

and think they're valid. That's alright though, as we will have rules in the semantic analyzer to rule out the invalid cases. Also, something like:

    out asig (3)

will actually be caught by the third rule:

    out_arg_list T_IDENT expr_list NEWLINE

because the (3) will be identified as an expression.

Note, Csound's opcode syntax and use of expressions are by nature ambiguous. Right now the current parser addresses that by having some semantic information early on, before parsing. This just flips that to say: this is acceptable at parse time, but we'll sort it out later at semantic analysis time.
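
As a rough illustration of the "accept at parse time, reject at semantic analysis time" approach from the message above, the sketch below classifies the six example statements once the parser has split them into an out_arg_list and an expr_list. The function name, the return shape, and the two-entry opcode set are assumptions for this example only. When both sides could qualify, the first form simply wins here; a real analyzer would need a more careful rule.

```python
# Hypothetical opcode names, used only for this illustration.
OPCODES = {"in", "out"}


def classify_two_part(out_args, exprs, opcodes=OPCODES):
    """Classify a statement matched by `out_arg_list expr_list NEWLINE`.

    Returns ("outputs", opcode, outputs) for forms like `asig, asig1 in`,
    ("inputs", opcode, inputs) for forms like `out asig, asig2, asig3`,
    or None when neither side is a single opcode name.
    """
    # Form 1: one or more outputs followed by a bare opcode name.
    if len(exprs) == 1 and exprs[0] in opcodes:
        return ("outputs", exprs[0], list(out_args))
    # Form 2: a bare opcode name followed by one or more inputs.
    if len(out_args) == 1 and out_args[0] in opcodes:
        return ("inputs", out_args[0], list(exprs))
    # Anything else parsed fine but is not a valid opcode statement.
    return None
```

The two statements the grammar accepts by accident (`asig, asig in, in, in` and `out, asig asig2, asig3`) have more than one item on both sides, so neither form applies and they are rejected.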

Date	2014-08-22 21:22
From	Felipe Sateler
Subject	Re: [Cs-dev] Parser3
On Fri, Aug 22, 2014 at 1:50 PM, Steven Yi wrote:
> The use of "out_arg_list expr_list" actually catches all scenarios for
> opcode calls where there is two parts, i.e.:
>
> asig in
> asig, asig1 in
> out asig
> out asig, asig2, asig3
>
> However, it will also catch:
>
> asig, asig in, in, in
> out, asig asig2, asig3

Shouldn't this rule be instead:

    | T_IDENT expr_list NEWLINE
    | out_arg_list T_IDENT NEWLINE

?

That should avoid catching the last two examples.

Again, I don't know much about parsers. But it seems to me that the grammar should describe what is expected: all those examples actually want a T_IDENT, and now you are "cheating" by pretending the T_IDENT is an expr (which you can get away with because expr includes T_IDENT).

But you probably tried this and it didn't work. How so?

> Note, Csound's opcode syntax and use of expressions is by nature
> ambiguous. Right now the current parser addresses that by having some
> semantic information early on before parsing. This just flips that to
> say, this is acceptable at parse time, but we'll sort it out later at
> semantic analysis time.

Hmm OK, I think I now see the ambiguity. And this of course makes writing a parser more difficult. These parser things go way over my head :p

--
Regards,
Felipe Sateler

Date	2014-08-22 22:06
From	Steven Yi
Subject	Re: [Cs-dev] Parser3
Yes, it is ambiguous. I had already tried:

    T_IDENT expr_list NEWLINE
    | out_arg_list T_IDENT NEWLINE

and what happens is that, when the parser has an initial T_IDENT, it doesn't know whether to reduce it to an out_arg_list or to shift and try to match the next value as an expr_list. This is because T_IDENT is a possible token that would match both expr and out_arg, and those are used for expr_list and out_arg_list respectively.

But yes, this is a workaround and not ideal, but there's not much to be done considering Csound's language.
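
The overlap described in the message above can be seen with a small brute-force matcher. This is not an LR parser and does not reproduce bison's conflict report; it only checks which of the two proposed rules can derive a whole line of token types. It assumes, as a simplification, that an out_arg is a T_IDENT, an expr is a T_IDENT or an INTEGER_TOKEN, and both lists are comma-separated. All function names are hypothetical.

```python
def is_list_of(tokens, allowed):
    """True if tokens form a non-empty comma-separated list of allowed types."""
    if not tokens or len(tokens) % 2 == 0:
        return False
    items, commas = tokens[0::2], tokens[1::2]
    return all(t in allowed for t in items) and all(c == "," for c in commas)


def is_out_arg_list(tokens):
    return is_list_of(tokens, {"T_IDENT"})


def is_expr_list(tokens):
    return is_list_of(tokens, {"T_IDENT", "INTEGER_TOKEN"})


def matching_rules(tokens):
    """List which of the two proposed rules can derive the whole token line."""
    if not tokens or tokens[-1] != "NEWLINE":
        return []
    body = tokens[:-1]
    matches = []
    # Rule 1: T_IDENT expr_list NEWLINE
    if body[:1] == ["T_IDENT"] and is_expr_list(body[1:]):
        matches.append("T_IDENT expr_list NEWLINE")
    # Rule 2: out_arg_list T_IDENT NEWLINE
    if body[-1:] == ["T_IDENT"] and is_out_arg_list(body[:-1]):
        matches.append("out_arg_list T_IDENT NEWLINE")
    return matches
```

Both `asig in` and `out asig` lex to `T_IDENT T_IDENT NEWLINE`, and under these assumptions that sequence is derivable by both rules. That is consistent with the parser being unable to decide, after the first T_IDENT, whether to reduce or shift.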

Date	2014-08-22 22:33
From	Felipe Sateler
Subject	Re: [Cs-dev] Parser3
Ah, that makes sense. Thanks for explaining!

--
Regards,
Felipe Sateler