false positive

berto -> false positive (12/18/2011 8:35:56 PM)


That would be in which file(s)?

That's just what I need, BTW: detailed feedback. Back and forth, give and take. Constant refinement and tweaking of both AGElint code and game data. Getting closer and closer to that (ever?) elusive goal: bug free games!




Chilperic -> RE: false positive (12/18/2011 9:37:01 PM)

You will have it :-)

What file? The FY file or the various aliases files? The latter are in the aliases directory.




berto -> RE: false positive (12/18/2011 10:01:09 PM)


The FY file. It's easy to locate stuff in the aliases files.




Chilperic -> RE: false positive (12/18/2011 10:12:35 PM)


quote:

ORIGINAL: berto


The FY file. It's easy to locate stuff in the aliases files.


AIAggro_WHI for example




berto -> RE: false positive (12/18/2011 10:29:38 PM)


Okay, I'll look into it.

What I've been working on over the past 4-5 days (multitasking with the gaming stuff):

Milwaukee Renaissance Band Concert, December 2011

Lest anybody think I don't have a life; that all I do is hang out at the Matrix Forum (and Modder Corner) trading postings with Chilperic/Clovis. [;)]

At Modder Corner, I had written:

quote:

Early tomorrow at the Matrix Forum, I will begin a series of posts describing (not too deeply) AGElint’s technical innards. Discussing the problem of false positives. Introducing the KWD keyword usage database. Outlining what has already been done, what still needs to be done, and how we might carry the project forward. And more.

I'm finished for now. I'll be back and doing the promised posts later. Have fun.




Chilperic -> RE: false positive (12/18/2011 10:35:51 PM)

quote:

ORIGINAL: berto


Okay, I'll look into it.

What I've been working on over the past 4-5 days (multitasking with the gaming stuff):

Milwaukee Renaissance Band Concert, December 2011

Lest anybody think I don't have a life; that all I do is hang out at the Matrix Forum (and Modder Corner) trading postings with Chilperic/Clovis. [;)]

At Modder Corner, I had written:

quote:

Early tomorrow at the Matrix Forum, I will begin a series of posts describing (not too deeply) AGElint’s technical innards. Discussing the problem of false positives. Introducing the KWD keyword usage database. Outlining what has already been done, what still needs to be done, and how we might carry the project forward. And more.

I'm finished for now. I'll be back and doing the promised posts later. Have fun.



Nice. [:)]

I'm not that impatient either. I have a life too. That's why SVF hasn't progressed as much as I had hoped. Nor has DNO. But in the last week, FY has reached a new level. Even as is, it is something I would remain proud of.





berto -> txt.l -- the AGElint lexer (12/19/2011 9:37:35 AM)


txt.l -- the AGElint lexer

Here is a brief, but technical, overview of the AGElint txt.l lexical analyzer. You may skip this discussion entirely if you wish.

What is lexical analysis? Quoting from Wikipedia:

http://en.wikipedia.org/wiki/Lexical_analysis

quote:

In computer science, lexical analysis is the process of converting a sequence of characters into a sequence of tokens. A program or function which performs lexical analysis is called a lexical analyzer, lexer or scanner. A lexer often exists as a single function which is called by a parser or another function.

Consider the sentence:

The rain in Spain falls mainly on the plain.

Lexical analysis might tokenize that sentence in many different ways, for example...

  • word by word:

    The
    rain
    in
    Spain
    falls
    mainly
    on
    the
    plain

  • word triplets:

    The rain in
    Spain falls mainly
    on the plain

  • fixed-width chunks of four characters:

    The
    rain
    in
    Spai
    n fa
    lls
    main
    ly o
    n th
    e pl
    ain.

  • chars separated by the string 'ain':

    The r
    ain
    in Sp
    ain
    falls m
    ain
    ly on the pl
    ain

And many, many other possibilities besides.
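To make two of these schemes concrete, here is a minimal C sketch. It is not AGElint code; the function names split_words and split_chunks and the fixed 32-byte token buffers are assumptions chosen purely for illustration. Note that splitting on spaces leaves the final period attached to the last word.

```c
#include <string.h>

/* Split text on spaces into at most max tokens of up to 31 chars each.
   Returns the number of tokens stored. (Hypothetical helper.) */
int split_words(const char *text, char tokens[][32], int max)
{
    int count = 0;
    const char *p = text;

    while (*p != '\0' && count < max) {
        while (*p == ' ')
            p++;                              /* skip separators */
        if (*p == '\0')
            break;
        size_t len = strcspn(p, " ");         /* length of the next word */
        size_t copy = len < 31 ? len : 31;    /* never overflow the buffer */
        memcpy(tokens[count], p, copy);
        tokens[count][copy] = '\0';
        count++;
        p += len;
    }
    return count;
}

/* Split text into consecutive chunks of width chars (the last may be shorter).
   Returns the number of chunks stored, or 0 if width is unusable. */
int split_chunks(const char *text, size_t width, char tokens[][32], int max)
{
    int count = 0;
    size_t total = strlen(text);

    if (width == 0 || width > 31)
        return 0;
    for (size_t pos = 0; pos < total && count < max; pos += width) {
        size_t len = total - pos < width ? total - pos : width;
        memcpy(tokens[count], text + pos, len);
        tokens[count][len] = '\0';
        count++;
    }
    return count;
}
```

The same input yields 9 tokens under the first scheme and 11 under the second, which is the point: what counts as a "token" is entirely a design decision of the lexer.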

The AGElint 'make' command inputs txt.l to the flex code generator to output C code -- lex.txt.c -- of hideous complexity. You shouldn't ever need to inspect lex.txt.c!

In the AGElint txt.l, you will see a series of pattern matches (so-called "regular expressions"), with -- bracketed by { and } -- some C code saying what to do with the matched character string.

A simple, and common, example:



MinDate                 { ECHO;
                          RETURNCMD(_MINDATE);
                        }


(Note: Apologies for the occasional extra blank lines preceding or following code snippets. The Matrix Forum code tags are quirky. I try to format the display nicely, but I don't always succeed!)

That is, in the data file, if the lexer (compiled txt.l code) sees the character string 'MinDate' ...

  • echo (depending) to the screen 'MinDate'
  • return to the calling parser the token _MINDATE (where internal to the C code the token _MINDATE is an integer, e.g., 866)
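Conceptually, the rule above maps a keyword to an integer token code. A minimal sketch of that idea in plain C follows. It is only an analogy: flex compiles all the patterns into a state machine rather than searching a table, and the names here (keyword_token, TOK_MINDATE, TOK_MAXDATE) and the value 867 are assumptions for illustration. Only the value 866 comes from the example above, and the real values are generated by bison.

```c
#include <string.h>

/* Hypothetical token codes; the real ones are generated, not hand-written. */
enum { TOK_UNKNOWN = 0, TOK_MINDATE = 866, TOK_MAXDATE = 867 };

struct keyword {
    const char *text;   /* character string as it appears in the data file */
    int token;          /* integer code handed to the parser */
};

static const struct keyword keywords[] = {
    { "MinDate", TOK_MINDATE },
    { "MaxDate", TOK_MAXDATE },
};

/* Return the token code for text, or TOK_UNKNOWN if it is not a keyword. */
int keyword_token(const char *text)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(text, keywords[i].text) == 0)
            return keywords[i].token;
    }
    return TOK_UNKNOWN;
}
```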

Some cases are a bit more complex, for example:



FlavorName              { ECHO;
                          BEGIN(NM);
                          RETURNCMD(_FLAVORNAME);
                        }


which adds the instruction



                          BEGIN(NM);


which says to begin the "start state" NM (name).

In txt.l, there is just one pattern match with the indicated NM start state:



<NM>{DL}{DLUWP}*{DL}"."*/{nDL} {
                          if (isfac(yytext) && agecmd && !mystrcasestr(agecmd, "name")) {
                            yyless(3);  /* push back all after faction name */
                            *(yytext+3) = '\0';
                            if (isdebug) {
                              fprintf(stdout, "%d %s: #%s#\n", lineno, "FACNAM2", yytext);
                            }
                            ECHO;
                            RETURNSTR(_FACNAM);
                          } else {
                            if (isdebug) {
                              fprintf(stdout, "%d %s: #%s#\n", lineno, "MNYNAM2", yytext);
                            }
                            ECHO;
                            RETURNSTR(_MNYNAM);
                          }
                        }


Translating:

When in the NM start state (only), any character string beginning with a single digit or letter; followed by zero or more digits, letters, underscores, whitespaces, or punctuation marks; followed by a single digit or letter; followed by zero or more . (period); followed by a single character that is neither a digit nor a letter (but don't add that single char to yytext) ...

  • if the matched text (yytext) signifies a faction (usually a sequence of three capital letters) AND agecmd is set to some char string AND agecmd does not contain the char string "name" (case insensitive) ...

    • push back to the input stream all matched chars except for the first three
    • terminate the yytext string after the third char
    • if in debug mode ...

      • print to screen: <line number> FACNAM2: #<faction name>#

    • echo (depending) to the screen the faction name
    • return to the parser the token _FACNAM, with also the matched text string (yytext)

  • else ...

    • if in debug mode ...

      • print to screen: <line number> MNYNAM2: #<the full matched text>#

    • echo (depending) to the screen the full matched text
    • return to the parser the token _MNYNAM, with also the matched text string (yytext)


Whew! That's a complicated case.
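The branching logic may be easier to follow stripped of the flex machinery. The sketch below is a simplified stand-in, not the AGElint source: looks_like_faction assumes "three leading capital letters" in place of the real isfac(), contains_nocase is a home-made substitute for mystrcasestr(), and truncating the buffer only imitates what yyless(3) does (the real call also pushes the remaining characters back onto the input stream).

```c
#include <ctype.h>
#include <string.h>

enum name_kind { NAME_FACTION, NAME_OTHER };

/* Assumed stand-in for isfac(): three leading capital letters. */
static int looks_like_faction(const char *s)
{
    return isupper((unsigned char)s[0]) &&
           isupper((unsigned char)s[1]) &&
           isupper((unsigned char)s[2]);
}

/* Assumed stand-in for mystrcasestr(): case-insensitive substring test. */
static int contains_nocase(const char *haystack, const char *needle)
{
    size_t n = strlen(needle);

    for (; *haystack != '\0'; haystack++) {
        size_t i = 0;
        while (i < n && haystack[i] != '\0' &&
               tolower((unsigned char)haystack[i]) ==
               tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return n == 0;
}

/* Decide which kind of name was matched. For a faction, text is cut down
   to its first three characters; otherwise it is left whole. */
enum name_kind classify_name(char *text, const char *agecmd)
{
    if (looks_like_faction(text) && agecmd != NULL &&
        !contains_nocase(agecmd, "name")) {
        text[3] = '\0';          /* keep only the faction tag */
        return NAME_FACTION;
    }
    return NAME_OTHER;
}
```

With agecmd set to a command whose name contains "name" (such as FlavorName), the text "RED Army" stays whole and is treated as an ordinary name; with a command that does not, it is cut down to the faction tag "RED".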

Fortunately, few of the pattern match cases are that complicated. By far, most are like the simple MinDate example shown above.

How does the lexer exit the NM start state? Under several circumstances, most typically:



<INITIAL,NM>\n.		{ lineno++;
			  if (isshow_text) {
				if (isshow_lineno) {
					fprintf(txtout, "\n%6d  ", lineno);
				} else {
					fprintf(txtout, "\n");
				}
			  }
			  BEGIN(INITIAL);
			  PUTBACK(1);
			}


That is:

Whether in the INITIAL (default) start state or the NM start state, for every newline (end of line, \n) followed by a single character (any character except another newline) ...

  • increment the internal lineno (line number) variable
  • if the isshow_text flag is set (to TRUE) ...

    • if the isshow_lineno flag is set (to TRUE) ...

      • print to the txt output stream a newline, then the line number on the immediate next line

    • else

      • just print to the txt output stream a newline

  • revert to the INITIAL (default) start state
  • push back to the input stream the matched char after the newline

(Note that in this case, nothing is returned to the parser.)

I invite you to look around the txt.l file. If you are really brave, you might try making changes here or there, then redoing the 'make' command to recompile the agelint executable. But beware: Sometimes even the simplest change -- especially those involving the pattern match wildcard '*' (zero or more) or '+' (one or more) -- might totally screw things up, and effectively break the entire lexical analysis. Programming lexers is not for the faint hearted!

But you really don't have to understand any of this. Just know that the AGElint lexer (compiled txt.l code) passes a "tokenized" game data file to the AGElint parser (compiled txt.y code) for subsequent syntactical (and other) analysis.

An overview of the AGElint txt.y parser will follow later ...




berto -> txt.y -- the AGElint parser (12/19/2011 11:57:46 AM)


txt.y -- the AGElint parser

Here is a very brief, but technical, overview of the AGElint txt.y parser. You may skip this discussion entirely if you wish.

What is parsing? Quoting from Wikipedia:

http://en.wikipedia.org/wiki/Parser

quote:

In computer science and linguistics, parsing, or, more formally, syntactic analysis, is the process of analyzing a text, made of a sequence of tokens (for example, words), to determine its grammatical structure with respect to a given (more or less) formal grammar...

Consider the sentence:

The rain in Spain falls mainly on the plain.

Grammatically speaking, that sentence may be viewed as a sequence of

  • article [The]
  • noun [rain]
  • prepositional phrase

    • preposition [in]
    • noun [Spain]

  • verb [falls]
  • adverb [mainly]
  • prepositional phrase

    • preposition [on]
    • article [the]
    • noun [plain]

but other organizing schemes are possible.

The formal grammar for English usually specifies that articles precede nouns, that prepositional phrases begin with prepositions and end with nouns, that subject precedes predicate, and so on.

The sentence

rain The in falls Spain mainly on the plain.

violates English grammar, as does

plain The rain in Spain falls mainly on the.

Obviously, there are innumerable ways to mangle English grammar.

Note that English grammar would accept unusual, but still "correct", sentences such as

The rain in Spain on the plain mainly falls.

Although English grammar has been codified in exhaustive detail, mostly we just somehow "know" what is correct grammar or not. But in computer code, "it just doesn't sound right" is inadequate. We need to spell out the grammar as accurately and completely as possible. In AGElint, we do that in the file txt.y.

The AGElint 'make' command inputs txt.y to the bison code generator to output C code -- txt.tab.h, txt.tab.c -- of hideous complexity. You shouldn't ever need to inspect txt.tab.h or txt.tab.c!

In the AGElint txt.y, you will see a list of token declarations (see the earlier txt.l lexer description) and some other technical stuff, followed by the AGE data file "formal grammar".

The agelint AGE data file formal grammar begins with:



start:            abilities
                | ais
                | aliases
                | diplomacies
                | ethnics
                | events
                | facattribs
                | facmods
                | factions
                | includes
                | merchandises
                | models
                | regions
                | religions
                | researches
                | rgndecisions
                | rulers
                | scripts
                | structures
                | terrains
                | units
                ;



That is, an AGE game data file must be exactly one of the listed possibilities (where '|' signifies OR).

Let's look more closely at the aliases case. The aliases stanzas are coded as:



aliases:        aliasthings
                ;

aliasthings:      aliasthing
                | aliasthings aliasthing
                ;

aliasthing:       val eq intvblist
                | val eq val
                | error
                ;


That is, a series of alias specifications (aliasthings), where an alias specification might be a val, followed by an eq (equal sign =), followed by either an intvblist or another val; else a specification might be in error (in which case the parser can recover from the mistake and move forward with the parsing).

Here are the specifications for val, eq & intvblist:



val:              alsval
                | facval
                ;

eq:             _EQ {
                  agecmdrhs = agecmd;
                  linenorhs = lineno;
                }

intvblist:        int
                | intvblist vb int
                ;


with some subsidiary specifications:



alsval:         _ALSVAL {
                    if (islist_aliases || islist_locals) {
                            fprintf(stdout, "%s:%d:%s\n", txtfile, lineno, $1+1);
                    }
                    free($1);
                }
                | gmaoptval
                | abinamval
                | abitxtval
                | dinamval
...
                | txtval
                | unitxtval
                | ldrval
                | mdlval
                ;

facval:         _FACVAL { free($1); }
                ;

int:              intpos { $$ = $1; }
                | intneg { $$ = $1; }
                ;




and on and on ...

Look rather complicated? It is!

In the txt.y formal grammar, stanzas such as int: might be viewed as subroutines. It's programming, but programming of a sort different from what you are probably used to.
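To illustrate the "rules as subroutines" view, here is a hand-written C sketch of the int and intvblist rules. It is an analogy only: bison generates a table-driven parser, not functions like these, and this sketch works on raw characters rather than on lexer tokens. The assumptions are that vb stands for a vertical bar '|' and that an int is an optional minus sign followed by digits. The names parse_int and parse_intvblist are made up for the example.

```c
#include <ctype.h>

/* int: intpos | intneg
   Accept an optional '-' followed by one or more digits.
   On success, advance *p past the integer and return 1; else return 0. */
static int parse_int(const char **p)
{
    const char *s = *p;

    if (*s == '-')
        s++;
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;
    *p = s;
    return 1;
}

/* intvblist: int | intvblist vb int
   The left-recursive rule becomes a loop: one int, then any number of
   "| int" repetitions. Returns the number of ints, or 0 on a syntax error. */
int parse_intvblist(const char *text)
{
    const char *p = text;
    int count;

    if (!parse_int(&p))
        return 0;
    count = 1;
    while (*p == '|') {
        p++;
        if (!parse_int(&p))
            return 0;
        count++;
    }
    return *p == '\0' ? count : 0;   /* trailing junk is an error */
}
```

Each grammar stanza turns into one function, and a stanza that mentions another stanza turns into a function call.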

As you can see, as with txt.l, in txt.y grammar components might include -- bracketed by { and } -- some C code saying what special things to do, if any.

For now, the important thing to understand is: if the pattern of tokens (output by the agelint lexer, the compiled txt.l) doesn't match one of the precisely defined sequences in the formal grammar (specified in txt.y), agelint will report a syntax error (and possibly do other stuff) -- the game data broke the grammar rules!

I invite you to look around the txt.y file. If you are really brave, you might try making changes here or there, then redoing the 'make' command to recompile the agelint executable. If you code a "little" change here or there, in the 'make' output, your conflicts line might say something like:

txt.y: conflicts: 768 shift/reduce, 523 reduce/reduce

rather than the current

txt.y: conflicts: 3 shift/reduce, 2 reduce/reduce

These "conflicts" represent ambiguities in the parser's grammar specification (txt.y). Ideally you want 0 conflicts of each type, but that is difficult to impossible, depending on the coding pains you take, and possibly the language (in this case, the AGE scripting language) that you are attempting to model.

As with lexer programming, programming parsers can be very weird indeed!

But you really don't have to understand any of this. Just know that the AGElint parser checks the game data file(s) for syntactical (and other) correctness. If agelint reports a syntax error, you know who reports it (the compiled txt.y).

I will have more, possibly much more, to say about the txt.y parser in future posts. The txt.l lexer is more or less settled. It's in txt.y where most of the action (and ambiguity and controversy and mistakes ...) takes place.




Chilperic -> RE: txt.y -- the AGElint parser (12/19/2011 1:37:29 PM)

I will check the aliases this week. From what I've seen in your posts, most of the bugs you have found in the official games concern scenarios other than the GC, and the FY GC has already been cleaned of some alias bugs. We'll soon see how many remain in FY.

REAL QA delivery is on the move. In FY.[8D]




berto -> RE: txt.y -- the AGElint parser (12/19/2011 8:50:34 PM)


txt.y parser: room for improvement

Consider again the sentence:

The rain in Spain falls mainly on the plain.

Proper English, meaningful (if cliched).

Now consider this sentence:

The thing in somewhere acts sometime on the something.

Still proper English, but meaningful?

Parts of the agelint txt.y parser are still like that: written in a generic sort of way; good for checking the most basic AGE syntax but little else.

Here is an example from txt.y:



changeactorpool: _CHANGEACTORPOOL eq sclist
                {
                /*
                http://www.ageod.net/agewiki/SetActorPool
                Syntax:  SetActorPool = ActorUID|Identifier(n)|Value(n).....

		yes, ChangeActorPool resolves to above Wiki page for SetActorPool
		"complex command" -- we	leave this as sclist for now
                */
                }
		;

sclist:		  thing
		| sclist sc thing
		;


where thing is a generic mishmash of this or that without common theme or sequence. ('sc' refers to semicolon.)

(Again: My apologies for the extra empty space before and after code examples. The Matrix Forum code feature is quite quirky, infuriating even! I have these good things to say about AGEOD: The AGEOD Forum handles code display properly, and their forum is among the most congenial, capable, and easy-to-use anywhere.)

If you visit the AGEWiki URL

http://www.ageod.net/agewiki/SetActorPool

you will see just how "complex" this command really is, how numerous and varied its arguments are, and how precise is the argument sequence. The simple 'changeactorpool: _CHANGEACTORPOOL eq sclist' doesn't begin to capture the complexity of it all.

Now consider this sentence:

The rain in somewhere falls mainly on the something.

Proper English: 'somewhere' (as with 'Spain') and 'something' (as with 'plain') are both nouns. On the surface, the sentence conforms to correct English grammar.

But does it make much better sense?

Some parts of the agelint txt.y are like that too: written in greater specificity, but still lacking in fully useful detail.

Here is an example from txt.y:



objectives:	_OBJECTIVES eq valintsclist
		;

valintsclist:	  val sc int
		| valintsclist sc val sc int
		;

val:		  alsval
		| facval
		;
alsval:		_ALSVAL {
		    if (islist_aliases || islist_locals) {
		        fprintf(stdout, "%s:%d:%s\n", txtfile, lineno, $1+1);
		    }
		    free($1);
		}
		| gmaoptval
		| abinamval
		| abitxtval
		| dinamval
...
		| txtval
		| unitxtval
		| ldrval
		| mdlval
                ;


valintsclist (a semicolon-separated list of val/int data pairs) is more specific than the earlier, and extremely general, sclist. But you can see how the val in valintsclist can be almost anything. (And there is no checking of the int values either.) (I have omitted the facval specification for brevity.)
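As a rough picture of what valintsclist accepts, here is a character-level C sketch. It is not how the bison-generated parser works internally, and it assumes, purely for illustration, that a val is a '$' followed by letters, digits, or underscores, and that an int is an optional minus sign followed by digits. The function names are hypothetical.

```c
#include <ctype.h>

/* val (assumed form): '$' followed by one or more letters, digits, or '_'. */
static int skip_val(const char **p)
{
    const char *s = *p;

    if (*s != '$')
        return 0;
    s++;
    if (!isalnum((unsigned char)*s) && *s != '_')
        return 0;
    while (isalnum((unsigned char)*s) || *s == '_')
        s++;
    *p = s;
    return 1;
}

/* int (assumed form): optional '-' followed by one or more digits. */
static int skip_int(const char **p)
{
    const char *s = *p;

    if (*s == '-')
        s++;
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;
    *p = s;
    return 1;
}

/* sc: a single semicolon. */
static int skip_sc(const char **p)
{
    if (**p != ';')
        return 0;
    (*p)++;
    return 1;
}

/* valintsclist: val sc int | valintsclist sc val sc int
   Returns the number of val;int pairs, or 0 if the text does not match. */
int count_valint_pairs(const char *text)
{
    const char *p = text;
    int pairs = 0;

    do {
        if (pairs > 0 && !skip_sc(&p))
            return 0;
        if (!skip_val(&p) || !skip_sc(&p) || !skip_int(&p))
            return 0;
        pairs++;
    } while (*p != '\0');
    return pairs;
}
```

The sketch shares the weakness described above: any well-formed val and any integer pass, whether or not they make sense for an Objectives statement.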

The txt.y parser still needs much filling out, and deepening. For now, it’s like the parser understands grade school English. I/we need to teach it high school or college English. The deeper AGElint’s understanding of “English”, the more errors it will detect.

Here is an example from txt.y of a well qualified, detailed, fully fleshed out command specification:



selectsubunits: _SELECTSUBUNITS eq selectsubunitsparms
                {
                /*
                http://www.ageod.net/agewiki/SelectSubUnits
                Syntax:  SelectSubUnits = Region <RgnUID>;Area <AreaUID>;Families <Fam1> <Fam2> ...;Models <Mdl1> <Mdl2> ... ;FactionTags <Tag1> <Tag2> ... ;Domains <_domLand> <_domNav> <_domAir>;<Attributes>;Generations <ModelGen1> <ModelGen2> ...
                */
                }
                ;
selectsubunitsparms: selectsubunitsparm
                | selectsubunitsparms sc selectsubunitsparm
                ;

selectsubunitsparm:
                  _REGION val
                | _AREA alsval
                | _FAMILIES alsvallist
                | _MODELS mdlvallist
                | _MODELS ldrvallist
                | _FACTIONTAGS faclist
                | _DOMAINS alsvallist
                | selectsubunitsattr
                | _GENERATIONS gennamlist
                | _THEATER alsvallist {
                    txterrmsg(_WARNING, TRUE, linenorhs, "suspicious, undocumented usage of Theater");
                  }    
                ;
selectsubunitsattr:
		  _ONLYFIXED
		| _ONLYNOTFIXED
		| _ENEMY
		| _FRIENDONLY
		| _FRIENDANDSELF
		| _UNIQUENAME anynam
		| _ONLYPERMFIXED
		| _ONLYNPERMFIXED
		;


Although there remains some generality (see the use of alsvallist -- a list of generic alias values), the SelectSubUnits command arguments are spelled out in considerable detail. If for example the Generations keyword is followed by something other than a gennamlist -- an ldrvallist, or an integer, or another keyword, or ... -- agelint will spot the error, and report it. No ambiguity or false positives here. (And hopefully not the converse: errors not detected or reported.)

I estimate that maybe half the txt.y parser is still generic, and half is adequately specific.

Again:

quote:

The deeper AGElint’s understanding of “English” [the AGE command language], the more errors it will detect.

Now if only we could teach the agelint txt.y parser to understand "Shakespearean English"! (That is, the AGE command language in all of its detail, richness, and nuance.)

When I write

quote:

Please bear in mind that, for all of its current sophistication, AGElint is a work-in-progress, more than a beginning but far from an ending.

this is in part what I mean. There is still much, much more to be done, and more and more bugs to detect!

I will have more to say about the txt.y parser, and other areas of extension and improvement, in the days and weeks ahead.




berto -> RE: txt.y -- the AGElint parser (12/20/2011 6:43:41 PM)


SelectFaction & SelectRegion: sweeping them under the rug

In most or all AGE Events (and other) files, you will see many SelectFaction & SelectRegion statements, for example in the file 5-Finland1918.sct:



SelectFaction = $FIN

SelectFaction = $FIN
StartEvent = evt_nam_FIN_GeneralShift1|999|0|NULL|NULL|NULL|NULL

Conditions
  MinDate = 1918/01/01
  MaxDate = 1918/04/30

Actions
  ChangeLoyaltyFac = $Theater_Finland;1

SelectFaction = $RED
  ChangeLoyaltyFac = $Theater_Finland;-1

EndEvent


(Blasted extra blank lines! [:@])

And another example:



SelectFaction = $RED
SelectRegion = $Petrograd

SelectFaction = $RED
SelectRegion = $Petrograd
StartEvent = evt_nam_RED_RedArmyReorganization|1|2|evt_txt_RED_RedArmyReorganization|Event-img_RED_RedArmyReorganization|$Petrograd|NULL

Conditions
  FixedDate = 1918/05/08

Actions
  DescEvent = evt_desc_RED_RedArmyReorganization

EndEvent


From the syntactic point of view, these SelectFaction & SelectRegion statements are inserted almost at random. They wreak havoc with the agelint formal grammar specification. Their presence totally confuses the agelint parser!

Leaving them in would cause tons of syntax "errors".

In order to deal with this problem, to ignore it really, I have "swept the problem under the rug" via:



SelectFaction{W}*={W}*$?{C}{DC}{2}/{nDC} {
                          ECHO;
                        /* the arbitrarily placed SelectFaction wreaks havoc
                           with the grammar, so we "cheat" by not returning
                           it to the parser
                         */
                        /* RETURNCMD(_SELECTFACTION); */
                        }

SelectRegion{W}*={W}*.+/\n {
                          ECHO;
                        /* the arbitrarily placed SelectRegion wreaks havoc
                           with the grammar, so we "cheat" by not returning
                           it to the parser
                         */
                        /* RETURNCMD(_SELECTREGION); */
                        }


Note how (in txt.l) I have commented out the 'RETURNCMD(_SELECTFACTION);' and 'RETURNCMD(_SELECTREGION);' statements. That is, I don't return those (tokenized) statements to the parser, so the parser (txt.y) essentially ignores them (actually is blissfully unaware of them).

In the second example, there is a logical connection between the 'SelectFaction = $RED' statements and the subsequent use of _RED_ in the StartEvent and DescEvent statements.

But, at present, the agelint parser ignores that connection.

This is one of those complexities that, in my haste to code a functioning AGElint, I have temporarily ignored. ("Haste", because I began work on AGElint at the time of a critical need for it, i.e., in June of 2011. [;)])

So, here is another area for agelint improvement: effectively reincorporating SelectFaction & SelectRegion into the parser's semantic and logical bug analysis. (We can still disregard them in the syntactic analysis, I think.)
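One possible shape for such a check, sketched in C: remember the faction tag from the most recent SelectFaction line, then test whether a later event name mentions it. Nothing like this exists in AGElint as described here. The function names, the assumption that tags are exactly three characters, and the idea that an event name should contain _TAG_ are all illustrative guesses. Since a SelectFaction for another faction can legitimately appear inside an event's Actions (as in the Finland example), a mismatch would merit at most a warning.

```c
#include <stdio.h>
#include <string.h>

/* Extract the 3-character tag from a line such as "SelectFaction = $RED".
   Writes it to tag and returns 1, or returns 0 if the line does not match. */
int select_faction_tag(const char *line, char tag[4])
{
    const char *p = strchr(line, '$');

    if (strncmp(line, "SelectFaction", 13) != 0 || p == NULL ||
        strlen(p + 1) < 3)
        return 0;
    memcpy(tag, p + 1, 3);
    tag[3] = '\0';
    return 1;
}

/* Return 1 if event_name contains "_<tag>_", else 0. */
int event_mentions_faction(const char *event_name, const char *tag)
{
    char needle[8];

    snprintf(needle, sizeof needle, "_%s_", tag);
    return strstr(event_name, needle) != NULL;
}
```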

And again:

quote:

The deeper AGElint’s understanding of “English” [the AGE command language], the more errors it will detect.




berto -> RE: txt.y -- the AGElint parser (12/20/2011 7:14:09 PM)


checking data values

Beyond syntactic and semantic analysis, we can also add logical analysis to the agelint lexer/parser combo.

For example, instead of using the generic int all the time:



int:              intpos { $$ = $1; }
                | intneg { $$ = $1; }
                ;


we could instead use



chance:           _ZERO { $$ = 0; }
                | _ONE { $$ = 1; }
                | _INTPOS { $$ = $1; CHKINT($1, 0, 0, 100, 100); }
                ;

pct:              _ZERO { $$ = 0; }
		| _ONE { $$ = 1; }
                | _INTPOS { $$ = $1; CHKINT($1, 0, 0, 100, 100); }
                ;


So, wherever we specify the Probability statement:



probability:      _PROBABILITY eq chance
		| _PROBABILITY eq chance sc _ESVINTVAR sc int
                {
                /*
                http://www.ageod.net/agewiki/Probability
                Syntax:  Probability = Value; <esvIntVar(x)>; <VarCoeff>
     		*/
                }
     		;


rather than the more generic



probability:      _PROBABILITY eq int
...
     		;


the parser will catch as errors probabilities > 100% or < 0%.

Another way we can check numerical values is via the CHKINT() C macro:



#define CHKINT(val, min, low, high, max) if ((val < min) || (val > max)) { \
                                                txterrmsg(_ERROR, TRUE, linenorhs, "illegal value: %ld", val); \
                                        } else if (isshow_warn && ((val < low) || (val > high))) { \
                                                txterrmsg(_WARNING, TRUE, linenorhs, "suspicious value: %ld", val); \
                                        }


Sample usage:



assault:        _ASSAULT eq intpos
                { CHKINT($3, 0, 0, 100, 999);
                /*
                Syntax:  Assault = Value
		*/
                }
                ;


Any value from 101 to 999 will generate a warning message (if warnings are enabled), and anything beyond 999 (or below 0) will generate an error message.
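The same two-tier logic can be written as an ordinary function, which makes the four bounds easier to read. This is a sketch of the macro's decision logic only: check_int and the chk_result enum are invented names, and where the real macro calls txterrmsg() and consults the isshow_warn flag, this version just returns a result and takes the flag as a parameter.

```c
enum chk_result { CHK_OK, CHK_WARNING, CHK_ERROR };

/* min..max is the legal range; low..high is the unsuspicious range inside it.
   show_warn plays the role of the isshow_warn flag. */
enum chk_result check_int(long val, long min, long low, long high, long max,
                          int show_warn)
{
    if (val < min || val > max)
        return CHK_ERROR;            /* "illegal value" */
    if (show_warn && (val < low || val > high))
        return CHK_WARNING;          /* "suspicious value" */
    return CHK_OK;
}
```

When low equals min and high equals max, as in the chance and pct rules, the warning tier disappears and every out-of-range value is an error.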

Still more data values we can check are date values, as in:



datestuff:      datethings {
                  if (maxdateval && mindateval) {
                    if (date2days(maxdateval) < date2days(mindateval)) {
                        txterrmsg(_ERROR, FALSE, linenorhs, "MaxDate %s precedes MinDate %s", maxdateval, mindateval);
                    } else if (date2days(maxdateval) == date2days(mindateval)) {
                        txterrmsg(_WARNING, FALSE, linenorhs, "MinDate %s same as MaxDate %s", mindateval, maxdateval);
                    }
                  }
                }
                ;

datethings:       datething
                | datethings datething
                ;

datething:        maxdate
                | mindate
                ;


(For the record, using AGElint, I found four instances in three different AGEOD games where MaxDate < MinDate.)
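A self-contained C sketch of the date comparison follows. The implementation of AGElint's date2days() is not shown in this thread, so date_key below is an assumed replacement: it does not count real calendar days, it only produces a number that grows with the date, which is all the comparison needs. It also assumes the YYYY/MM/DD form seen in the event examples; all names are hypothetical.

```c
#include <stdio.h>

enum date_check { DATES_OK, DATES_SAME, DATES_REVERSED, DATES_INVALID };

/* Convert "YYYY/MM/DD" into a number that increases with the date.
   Returns -1 if the text is not in that form. Not a true day count. */
long date_key(const char *text)
{
    int y, m, d;

    if (sscanf(text, "%d/%d/%d", &y, &m, &d) != 3)
        return -1;
    if (m < 1 || m > 12 || d < 1 || d > 31)
        return -1;
    return ((long)y * 12 + (m - 1)) * 31 + (d - 1);
}

/* Mirror the datestuff logic: reversed dates are an error,
   identical dates are merely suspicious. */
enum date_check check_date_range(const char *mindate, const char *maxdate)
{
    long lo = date_key(mindate);
    long hi = date_key(maxdate);

    if (lo < 0 || hi < 0)
        return DATES_INVALID;
    if (hi < lo)
        return DATES_REVERSED;
    if (hi == lo)
        return DATES_SAME;
    return DATES_OK;
}
```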

There are many more data values we can check. I have fully fleshed out the logical date analysis, and added a fair amount of CHKINT() checks, but barely scratched the surface of other kinds of data value checks.

So much to do, so little time!




berto -> RE: txt.y -- the AGElint parser (12/20/2011 7:43:21 PM)


Summarizing so far:

The agelint lexer/parser combo

  • is now adequate for checking syntax errors, but not yet fully refined
  • checks about halfway for semantic errors
  • does a fair amount of data value checks
  • has barely begun to check for other kinds of logical errors

So again:

quote:

The deeper AGElint’s understanding of “English” [the AGE command language], the more errors it will detect.

And again:

quote:

There are still many, many more things that AGElint might check [outside of agelint; in the various chk*.pl and other scripts]. Really, the possibilities are almost boundless.

(I listed some of those possibilities in the AGElint to-do list in an earlier message of this AGElint thread.)

By now, AGElint has detected hundreds, even thousands, of bugs across the six different AGEOD AGE system games. (Only some of these results have been shared, and still fewer have been made public.) With further AGElint development, there is the possibility of detecting many, many more bugs. (And of reducing, hopefully eliminating, the many current false positives. [;)])

A reminder:

quote:

NOTE: I make no claim about the significance or insignificance of any discovered bug, problem, glitch, or anomaly. Whether or not it impacts game play, or goes entirely unnoticed. Whether in the larger scheme of things it's important, or unimportant. It's up to you to decide, and maybe for us as a community to determine.

I began work on AGElint in late June 2011. (AGElint builds on earlier work, the eu3debug project, that I did for the EU3 Magna Mundi mod.) By the time of my <cough> leave taking from AGEOD in late October 2011, after four months of hard, sometimes near round-the-clock effort (in order to meet "deadlines" for imminent beta and "official" game patches), the state of AGElint was:

quote:

Please bear in mind that, for all of its current sophistication, AGElint is a work-in-progress, more than a beginning but far from an ending.

How much further I take AGElint development depends on a lot of things, in particular

  • actual usage (not necessarily by me; I am finished running AGElint reports for others)
  • useful, constructive feedback (let us strive to keep negativity out of the discussion)
  • possibly other coders joining the effort (so AGElint is a community Open Source project)
  • user and player enthusiasm

We'll see what develops...




Chilperic -> RE: txt.y -- the AGElint parser (12/21/2011 9:49:44 AM)

With RL still taking its toll on my free time [:D], here is what I'm planning for the next few weeks:

- to check the FY events again, and check the aliases to fix whatever needs fixing. From my tests and the 3-player PBEM currently being played, FY is stable and fully enjoyable as is (yes, even in PBEM, and even with its strong AI)
- to upload small FY updates fixing bugs as they are discovered and improving balance
- at the same time, to use agelint on SVF 2.0 before completing its AI and features
- around 15th January, to upload version 1.07 of FY, introducing the last secondary factions (Galician Army, Caucasus Republics) and better assessing Turkestan.

A lot of work. Agelint has helped me fix one or two bugs in the current FY version which had a real impact on AI performance, judging by how the AI has gone from good to even better in version 1.06.

Agelint is THE TOOL. Those believing the contrary are wrong, as usual [8|] Past facts are eloquent, current FY state too [:)]





berto -> RE: txt.y -- the AGElint parser (12/21/2011 11:33:07 AM)


Thanks for the endorsement!

I have a favor to ask of you, Chilperic...

After perhaps reviewing my earlier "summarizing improvements/to-do" post above, please:

  • Skim through the txt.y parser file. Note the top 10 or 20 AGE commands that

    • give rise to the most bugs (in your experience)
    • are now coded in a generic fashion
    • would benefit from greater specificity (e.g., menu commands with easily bugged/mangled argument sequences)

  • Comment on the importance of re-integrating SelectFaction & SelectRegion statements into the txt.y semantic & logical analysis.
  • Suggest any other areas of urgently needed fixes/extensions/improvements. (But please don't rag on the issue of false positives. For now, please just mentally disregard them!)

I want to keep moving AGElint development forward. But I want to focus on the most important and most useful things near term, and to make the most efficient use of my limited free time.

But no coding for me over the long Christmas weekend ahead. My gaming time will instead be devoted to begin playing RUS/FY in the next day or two! [:)]




Chilperic -> RE: txt.y -- the AGElint parser (12/21/2011 1:40:54 PM)

I will, in the next weeks.

And false positives aren't a concern for me. I just can't understand why they should be a breaking item... unless of course other motives are at work [:D]





berto -> RE: txt.y -- the AGElint parser (12/21/2011 8:51:36 PM)


quote:

ORIGINAL: Chilperic

False positives aren't a concern for me. I just can't understand why they should be a breaking item...

The more time and effort I spend on avoiding false positives, the less time and effort I have for confronting true positives, real errors.

Here's the AGElint formal grammar for avoiding any false positives:



start:            things
                ;

things:           thing
                | things thing
                ;



That's it, beginning to end, nothing more, nothing less.

Voila! No false positives! And no bug reports either! [:D]

It's like I'm a musician so fearful of playing a false note that I put down my instrument and refuse to play.

Silence is golden, ignorance is bliss? [;)]




Chilperic -> RE: txt.y -- the AGElint parser (12/21/2011 11:06:08 PM)

In my event files, I've got a few POSSIBLE false positives. Most concern EvalRgnWeather; the others are sparse, and still to be confirmed.

Anyway, it's simple: if I don't see the bug immediately, I run a test game and look at the script report. If the event is working, I will post a new false positive here. If not, I will have to find the bug [:D]

I don't understand why false positives should be a problem when huge games made of almost endless data don't yet have any real debugging tool [8|] Maybe because I'm mad. Hey, after all, I work hard on something free [:D]




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 1:34:12 AM)

Trying the alias-checking part of AGElint, I've almost certainly found another false positive:

/cygdrive/c/Revolution under Siege/Revolution Under Siege/RUS/GameData/Models/1304WH3N. Sukhin.mdl:71: AIAff0_inf not found
/cygdrive/c/Revolution under Siege/Revolution Under Siege/RUS/GameData/Models/1305WH3Matsievsky.mdl:71: AIAff0_inf not found
/cygdrive/c/Revolution under Siege/Revolution Under Siege/RUS/GameData/Models/1306WH3A.I. Dutov.mdl:73: AIAff3_cav not found

The AIAff values aren't defined by aliases but by the AI settings file. As many leaders in FY have such a setting, the list is long ;-)




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 1:40:05 AM)

A second kind of unnecessary report is, IMHO, the one about $mdl_toe_xxx_xxx.

This line is present in every mdl file for RUS, but not for PON, and is as far as I know unused... Not a false positive, but not useful.

On the other hand, the terrain aliases of the official version seem to have at least 3 important errors, now fixed in FY.

BTW, the most important of these errors exists in PON 1.02 too.




berto -> RE: txt.y -- the AGElint parser (12/22/2011 2:34:11 AM)


This is good and useful feedback.

These false positives are easy to fix.

In the chkaliases.pl program file, you will see:



        if (($ref =~ /\d+:.*(abi_nam_|abi_txt_|di_nam_|di_txt_|eth_nam_|evt_desc_|evt_nam_|evt_txt_|evt_|fac_|fat_|ldr_|mdl_txt_|mer_nam_|model_name_|model_shortname|nam_|opt_desc_|opt_hint_|opt_notify_|opt_title_|opt_|rel_|replaced_|rgd_nam_|rgd_shortnam_|rgd_txt_|ri_name_|ri_text_|stc_|str_|ter_nam_|ter_txt_|txt_|uni_txt_)/i) ||
            ($ref =~ /\d+:(weather_|regionname\d|str[A-Za-z0-9])/i) ||
            ($ref =~ /\d+:.*(xxx|ri_shortname_|unit_)/i) ||
            ($ref =~ /\d+:(airbase|airship|armored|artillery|baloon|boer|camel|cavalry|coastal|colonial|elite|engineer|expedition|fanatical|feudal|field|foo|foreign|fortress|guard|heavy|infantry|labour|light|marine|militia|mixed|mountain|mounted|national|partisans|peasant|pioneer|police|prospection|provincial|regular|reserve|samurai|siege|signals|slave|storm|supply|tank|urban)\s*/i) ||
            ($ref =~ /\d+:.*(asemoney)/i) ||
            (0)  # allows easy commenting out of special cases above
           ) {
            next;  # skip possible localizations, and other questionables
        } else {


Simply add '|aiaff\d_|mdl_toe_' to the 3rd of those '$ref =~' tests (could just as well use one of the others), as in



            ($ref =~ /\d+:.*(xxx|ri_shortname_|unit_|aiaff\d_|mdl_toe_)/i) ||



That should exclude those categories of false positives that you mention.
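
To see what that pattern change does without touching the Perl, here is a minimal sketch in Python (not AGElint code). It copies the third test before and after the edit. The sample refs and their "line: alias" shape are assumptions made for illustration, and skipped() is a hypothetical helper.

```python
import re

# Third exclusion test from chkaliases.pl, before and after the suggested edit.
# Perl's unanchored "=~ /.../i" corresponds to re.search with re.IGNORECASE.
BEFORE = re.compile(r"\d+:.*(xxx|ri_shortname_|unit_)", re.IGNORECASE)
AFTER = re.compile(r"\d+:.*(xxx|ri_shortname_|unit_|aiaff\d_|mdl_toe_)", re.IGNORECASE)

# Hypothetical refs in an assumed "line: alias" form.
REFS = [
    "71: AIAff0_inf",
    "73: AIAff3_cav",
    "12: mdl_toe_WH3_Inf",
    "201: Tzarytsin",
]


def skipped(pattern, refs):
    """Return the refs that the exclusion pattern would skip (not report)."""
    return [ref for ref in refs if pattern.search(ref)]


if __name__ == "__main__":
    print("skipped before:", skipped(BEFORE, REFS))
    print("skipped after: ", skipped(AFTER, REFS))
```

With these samples the old pattern skips nothing, the new one skips the AIAff and mdl_toe refs, and a genuine miss such as Tzarytsin is still reported.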

(It's possible that a similar edit might have to be made to the chklocals.pl file.)

That should do the trick. Please let me know.

I will be sure to fix this in the next AGElint release.

These are categories of "alias" that I had wondered about; had seen so many of them that I suspected they might be false positives. But I never got the feedback saying such until now. [8|]

Good luck, and thanks again for the feedback.




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 10:00:17 AM)

I've finished running the alias check. First, there are not that many false positives.

Other reports are right, but need no fix, as they may result from leftovers, like a unit no longer used in the game which lacks some info...

On the other hand, I've discovered another potentially important misspelling concerning RGD...




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 12:20:07 PM)

Another false positive:

Armee Gruppe S<FC>d|

AGElint doesn't take ü into account.




berto -> RE: txt.y -- the AGElint parser (12/22/2011 12:25:15 PM)


Ah, yes. "Odd", "non-standard" (i.e., non-ASCII) chars in "foreign" names. They are a b*tch to account for in the txt.l lexer pattern matches (regular expressions). I will make note of that false positive and correct it (and its brethren) in the next AGElint release (probably the middle of next week).
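
The kind of pattern problem involved can be sketched in Python. Both patterns below are hypothetical stand-ins, not the actual txt.l rules: one accepts only ASCII name characters, the other also accepts the Latin-1 range 0xC0-0xFF (which includes ü, byte 0xFC, but also the two symbols × and ÷).

```python
import re

# Hypothetical name patterns; these are NOT the actual txt.l rules.
ASCII_NAME = re.compile(r"[A-Za-z0-9_. ]+")
LATIN1_NAME = re.compile(r"[A-Za-z0-9_. \u00C0-\u00FF]+")

# The name from the report; ü is the single byte 0xFC in Latin-1.
NAME = "Armee Gruppe S\u00fcd"


def accepts(pattern, text):
    """True if the pattern consumes the whole name."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    print("ASCII-only pattern accepts:", accepts(ASCII_NAME, NAME))
    print("Latin-1 pattern accepts:   ", accepts(LATIN1_NAME, NAME))
```

A real fix also depends on how the data files are encoded, which this sketch does not address.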

(Please revisit the chkaliases.pl edits I suggested in the third post previous. I added 'toe_' to the '|aiaff\d_|mdl_toe_' in the chkaliases.pl pattern match.)

Successfully installed FY 1.06 last night, BTW. Will begin playing it today. [:)]




berto -> RE: txt.y -- the AGElint parser (12/22/2011 1:12:21 PM)


quote:

ORIGINAL: Chilperic

Other reports are right, but need no fix, as they may result from leftovers, like a unit no longer used in the game which lacks some info...


The problem I have with this "unused" (and "testing") stuff is if leftovers have potential bugs waiting to happen. If left in, the stuff might still get used in future. And possibly confuse newbies (and more experienced modders who missed the logic behind the original inclusion, and why/how it was switched off). I would prefer that buggy older leftover/testing stuff be cleaned out. Or if left in, please somehow clearly mark it as leftover (and if a known bug remains, comment on why it was not fixed).

It's like I have C code with buggy functions, that don't get called. Or variables that never effectively get malloc()'ed (but might be), or are malloc()'ed but on a small scale (now; but what about in future?), so "who cares about free()'ing them?" (doing "garbage collection"). Or other commented out code lines that break the compiler (due maybe to mangled C syntax), but are left in without explanation as to why they are commented out. Or... Bugs all waiting to happen.

(In the agelint code, I am careful about "garbage collection," free()'ing up space on the memory heap that has no further use. At the moment, given that each agelint program run acts on just a single data file at a time, "what's the big deal about a few hundred or thousand bytes here or there left wasted on the heap?" But in future implementations, I might have one agelint program run process all data files en masse. Or morph from a one-file-at-a-time processor into a real-time, interactive data file editor/debugger that one leaves running for an extended debugging session lasting for days. Then the long-term needless memory heap growth will begin to matter. So attend to the free()s!)

Or in a former war zone, there are "just a few" live anti-personnel mines scattered across farm fields here and there. Do we leave them be and keep our fingers crossed that no kid or cow or ... will step on one accidentally and blow itself up? Do we put up warning signs -- "DANGER: land mines ahead. Proceed with caution." -- and hope for the best? (But what if a child or a foreigner can't read the sign?) Or do we assign a team of sappers to go in and clean out the mines?

IMHO, coders/devs should in almost all cases take out the trash!

So this is one of those differences of opinion where I refuse to acknowledge some cases of "false positive". In the actual current code execution, it may not matter. But in future, it really might.




berto -> reformatting chkaliases.pl output (12/22/2011 3:18:39 PM)


reformatting chkaliases.pl output

As you use chkaliases.pl (and the other chk*.pl scripts), be sure to inspect the dochk script for useful ways to reformat raw chkaliases.pl output.

(The following examples were done in my Linux setup, using my personal (Linux) agelintroot & gameroot. Be sure to substitute your own agelintroot & gameroot and make other adaptations as necessary.

Windows Cygwin users note: If any commands are missing (awk, possibly), just use the previously described Cygwin install procedure to grab the missing pieces.)

For example, a raw chkaliases.pl invocation:



[root@telemann agelint]# ./chkaliases.pl +i +E -g rus
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptGrandCampaign.ini:201: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:184: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:192: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:192: Vladivostock not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:200: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptMay1919.ini:200: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptNovember1918.ini:199: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/GameData/Units/0CMNErroneous Unit.uni:8: colCMNRegular not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/GameData/RgnDecisions/1-Tcheka.rgd:5: rgdPolitical not found
[...]


By "piping" this output to the supplied (in the AGElint distribution .zip file) grprpt.pl script, you can group this information nicely with:



[root@telemann agelint]# ./chkaliases.pl +i +E -g rus | ./grprpt.pl

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptGrandCampaign.ini:201: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:184: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:192: Tzarytsin not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:192: Vladivostock not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptIceMarch1917.ini:200: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptMay1919.ini:200: Tzarytsin not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Scripts/ScriptNovember1918.ini:199: Tzarytsin not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/GameData/Units/0CMNErroneous Unit.uni:8: colCMNRegular not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/GameData/RgnDecisions/1-Tcheka.rgd:5: rgdPolitical not found
[...]


Makes it easier to read, no?
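
For readers curious about the grouping idea, here is a minimal Python sketch. It is an assumption about the behavior, inferred only from the output shown above (consecutive lines naming the same alias form one group), and is not the grprpt.pl source. The file names in the sample are shortened, hypothetical ones.

```python
from itertools import groupby

SUFFIX = " not found"


def alias_of(line):
    """'path:lineno: Alias not found' -> 'Alias'."""
    text = line.rsplit(":", 1)[-1].strip()
    return text[: -len(SUFFIX)] if text.endswith(SUFFIX) else text


def group_report(lines):
    """Group consecutive report lines that name the same alias."""
    return [list(group) for _, group in groupby(lines, key=alias_of)]


SAMPLE = [
    "ScriptA.ini:201: Tzarytsin not found",
    "ScriptB.ini:184: Tzarytsin not found",
    "ScriptB.ini:192: Vladivostock not found",
    "ScriptB.ini:200: Tzarytsin not found",
]

if __name__ == "__main__":
    for group in group_report(SAMPLE):
        print()  # a blank line separates the groups
        print("\n".join(group))
```

Because only consecutive lines are grouped, the same alias can appear in more than one group unless the input is sorted first.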

You can also sort and group with:



[root@telemann agelint]# ./chkaliases.pl +i +E -g rus | sort -t: -k3 | ./grprpt.pl

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WHIAI.sct:1571: APiatigorsk not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WH3AI.sct:1728: aratov not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WH3AI.sct:861: aratov not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WH3IKolchak.sct:1608: aratov not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WH3IKolchak.sct:746: aratov not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WHAICZECHLEGION.sct:1790: aratov not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/WHAICZECHLEGION.sct:928: aratov not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Misc Events.sct:1416: Area_Bessarabia not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Misc Events.sct:1419: Area_Bessarabia not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Options Reinforcements.sct:1731: Area_Central_Asia not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Options Reinforcements.sct:1734: Area_Central_Asia not found

/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Misc Events.sct:1410: Area_Moldavia not found
/media/KINGSTON/Games/AGEOD/Revolution under Siege/RUS/Events/RUS Drang Misc Events.sct:1413: Area_Moldavia not found
[...]
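
The effect of sort -t: -k3 (order the lines by everything after the second colon) can be approximated in Python as below. This is a sketch under assumptions: casefold() stands in for the locale-aware ordering seen in the output above, ties fall back to comparing whole lines, and the file names are shortened, hypothetical ones.

```python
def sort_key(line):
    """Key on the text after the second ':'; the whole line breaks ties."""
    parts = line.split(":", 2)
    field = parts[2] if len(parts) == 3 else ""
    return (field.strip().casefold(), line)


def sort_by_alias(lines):
    """Return the report lines ordered by the alias they mention."""
    return sorted(lines, key=sort_key)


SAMPLE = [
    "Events/B.sct:1410: Area_Moldavia not found",
    "Events/A.sct:861: aratov not found",
    "Events/C.sct:1571: APiatigorsk not found",
    "Events/A.sct:1728: aratov not found",
]

if __name__ == "__main__":
    print("\n".join(sort_by_alias(SAMPLE)))
```

Sorting first brings every instance of an alias together, so a later grouping step produces one group per alias.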


This will get you a list of "bad" alias refs, with instance counts (so you know where to prioritize your bug fixing efforts):



[root@telemann agelint]# ./chkaliases.pl +i +E -g rus | sort -t: -k3 | ./grprpt.pl | awk -F: '{print $NF}' | sed '1,$ s/ not found//' | sort | uniq -c | sort -nr
     48  uni_POL_Gar3
     38  Ekaterinbourg
     37  foeCounterer
     36  Ekaterinovslav
     30  Volynskyt
     26  Saruymir
     16  Rojitche
     14  Theater_Central Russia
     11  Baltrpa
     10  Tallinnn
      8  Zapozhok
      8  Ukhulovo
      8  Tzarytsin
      7  WoodedHill
      7  Lappeenrenta
      6  Brest Litvosk
[...]
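
The same tally can be sketched in Python for anyone without awk and sed to hand. This is an illustrative equivalent of the awk/sed/sort/uniq steps, not part of AGElint; count_missing() is a hypothetical name and the sample lines use shortened, made-up file names.

```python
from collections import Counter

SUFFIX = " not found"


def count_missing(lines):
    """Count how often each alias is reported missing, most frequent first."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.endswith(SUFFIX):
            continue  # skip blank separator lines and anything unexpected
        # Like awk's $NF with -F: this keeps the text after the last colon.
        alias = line.rsplit(":", 1)[-1].strip()[: -len(SUFFIX)]
        counts[alias] += 1
    return counts.most_common()


SAMPLE = [
    "ScriptA.ini:201: Tzarytsin not found",
    "",
    "ScriptB.ini:184: Tzarytsin not found",
    "ScriptB.ini:192: Vladivostock not found",
    "ScriptB.ini:200: Tzarytsin not found",
]

if __name__ == "__main__":
    for alias, count in count_missing(SAMPLE):
        print(f"{count:7d}  {alias}")
```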


This next sequence will take ~5 minutes, more or less. Go take a coffee break while it runs. Be sure to note the use of agelintroot & gameroot, and make your own substitutions as necessary!

This will list the "bad" alias refs, with suggested registered (in actual Aliases/*.ini files) alternatives:



[root@telemann Aliases]# cd /home/berto/games/ageod/agelint

[root@telemann agelint]# pwd
/home/berto/games/ageod/agelint

[root@telemann agelint]# ./chkaliases.pl +i +E -g rus | sort -t: -k3 | ./grprpt.pl | awk -F: '{print $NF}' | sed '1,$ s/ not found//' | sort | uniq -c | sort -nr > ./chkaliases.out


Note how we capture the command sequence output to the file chkaliases.out.

Then:



[root@telemann agelint]# cd /media/KINGSTON/Games/Ageod/Revolution\ under\ Siege/RUS/Aliases; for a in `awk '{print $2}' /home/berto/games/ageod/agelint/chkaliases.out`; do echo ">$a"; cat *.ini | grep -i $a; echo; done
[...]

>Ekaterinovslav

[...]

>Theater_Central
$Theater_Central_Russia = 88
$Theater_Central_Asia = 93

[...]

>aratov
$Area_Saratov = 44
$Saratov = 642
$Saratov Shore = 1011
$Saratovka = 1519
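
A sketch of the same lookup in Python follows. It is not AGElint code. It keeps the whole bad ref (the shell loop above takes only the second whitespace-separated field, which is why "Theater_Central Russia" was searched as "Theater_Central"), tries a case-insensitive substring match like grep -i, and then adds near misses using the standard library's difflib. The function name and the 0.6 cutoff are assumptions.

```python
import difflib


def suggest(bad_ref, alias_names, n=3):
    """Return registered alias names that resemble bad_ref."""
    lowered = {name.lower(): name for name in alias_names}
    needle = bad_ref.lower()
    # Substring hits first, like `grep -i`.
    hits = [name for low, name in lowered.items() if needle in low]
    # Then fuzzy matches, closest first, for refs that are near misses.
    for low in difflib.get_close_matches(needle, list(lowered), n=n, cutoff=0.6):
        if lowered[low] not in hits:
            hits.append(lowered[low])
    return hits


# Alias names taken from the output above.
ALIASES = [
    "Theater_Central_Russia",
    "Theater_Central_Asia",
    "Area_Saratov",
    "Saratov",
    "Saratov Shore",
    "Saratovka",
]

if __name__ == "__main__":
    for ref in ("aratov", "Theater_Central Russia"):
        print(">" + ref)
        print("\n".join(suggest(ref, ALIASES)))
        print()
```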


But really the easiest way to do all of this is simply to run the dochk script, as in:



[root@berto agelint]# ./dochk rus 104 20111222
doing chklint...
doing chkaliases...
doing chklocals...
doing chkimages...
doing chkfiles...
creating zipfile...
+ rm -f agelint_rus_104_20111222.zip
+ zip --to-crlf agelint_rus_104_20111222.zip chkaliases_rus_104_20111222_lst.txt chkaliases_rus_104_20111222_pat.txt chkaliases_rus_104_20111222_rpt.txt chkaliases_rus_104_20111222_sorted_rpt.txt chkfiles_rus_104_20111222_lst.txt chkfiles_rus_104_20111222_rpt.txt chkfiles_rus_104_20111222_sorted_rpt.txt chkimages_rus_104_20111222_lst.txt chkimages_rus_104_20111222_rpt.txt chkimages_rus_104_20111222_sorted_rpt.txt chklint_rus_104_20111222_error_rpt.txt chklint_rus_104_20111222_notice_rpt.txt chklint_rus_104_20111222_warning_rpt.txt chklocals_rus_104_20111222_lst.txt chklocals_rus_104_20111222_rpt.txt chklocals_rus_104_20111222_sorted_rpt.txt
+ exit 0


Then inspect the various files you see in that very long list.

(NOTE: Until I can generalize it, you will have to edit dochk with your own personal agelintroot & gameroot!)

This series of examples underscores my assertion that AGElint is now, and will forever remain, a Linux/Unix/Windows Cygwin-based toolkit. With the right OS tools, the know-how to combine those tools in complex command sequences, and a bit of cleverness, you can do all manner of "magical" data analysis on the fly. Try doing any of this stuff in standard Windows (without time-consuming custom programming). And try doing any of this with data locked away in the "official" DB/.xls files. It can't be done!

I'll be hammering hard on these last points in future posts.




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 3:27:47 PM)

A real false positive:

in /cygdrive/c/Revolution under Siege/Revolution Under Siege/RUS/Events/RUSSiberianearlypath.sct, at (or before) line 1428: syntax error: #=# (0x3d), #=#

in event evt_nam_WH3_WH3ModerateReform_Tracker:

1425
1426 Actions
1427 SelectSubUnits = FactionTags WH2 ;Area $Theater_Volga
1428> AlterCuSubUnits = ApplyToList;ChgCohesion -25
1429
1430
1431 EndEvent


This command works and the syntax is right, but every occurrence is marked as an error by AGElint.




berto -> RE: txt.y -- the AGElint parser (12/22/2011 3:43:28 PM)


quote:

ORIGINAL: Chilperic

A real false positive:

[...]
1428> AlterCuSubUnits = ApplyToList;ChgCohesion -25
[...]

Right you are!

At

http://www.ageod.net/agewiki/AlterCuSubUnit

it says

quote:

AltercuSubUnits is also a valid syntax for this command

I missed that! [>:]

The fix, in txt.l, is simple.

Before:



AlterCuSubUnit          { ECHO;
                          BEGIN(NM);
                          RETURNCMD(_ALTERCUSUBUNIT);
                        }


And after:



AlterCuSubUnits?        { ECHO;
                          BEGIN(NM);
                          RETURNCMD(_ALTERCUSUBUNIT);
                        }


? is a quantifier matching zero or one instance of the preceding regular expression, in this case the char 's'.
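
The behavior of ? is the same in most regular expression dialects, so it can be checked quickly in Python. This sketch only illustrates the quantifier; it does not reproduce how the lexer tokenizes input, and fullmatch() is used here to require that the whole command word be consumed.

```python
import re

OLD_RULE = re.compile(r"AlterCuSubUnit")    # pattern before the fix
NEW_RULE = re.compile(r"AlterCuSubUnits?")  # trailing 's' is now optional


def whole_word_matches(pattern, word):
    """True if the pattern matches the entire command word."""
    return pattern.fullmatch(word) is not None


if __name__ == "__main__":
    for word in ("AlterCuSubUnit", "AlterCuSubUnits"):
        print(word,
              "old:", whole_word_matches(OLD_RULE, word),
              "new:", whole_word_matches(NEW_RULE, word))
```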

To be fixed in the next AGElint release.

Great feedback! Keep 'em coming!

(Even if I don't respond or act on them immediately this, ahem <cough>, holiday weekend. [8|])




Chilperic -> RE: txt.y -- the AGElint parser (12/22/2011 3:54:21 PM)

Do you realize we are currently building a better way to create data for AGE games than the official way? [:D]


Thanks to AGElint, I've fixed some bugs that have been present in several AGEOD games for years.

I just can't wait to apply this to SVF 2.0; development time divided by 4 to 5 at least. [:)]




berto -> RE: txt.y -- the AGElint parser (12/22/2011 4:03:29 PM)


And keep in mind: There is still much to be done, still much code broadening and deepening, potentially many more bugs to detect. "More than a beginning but far from an ending."

Must. Turn. Away. From. Forum. And. Play. Fatal Years. [:'(]



