[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Sc-devel] regexp support revisited :)

On Monday 26 November 2007, Florian Schmidt wrote:
> On Monday 26 November 2007, Florian Schmidt wrote:
> > On Monday 26 November 2007, Jan Trutzschler wrote:
> > > ok, i checked it in ...
> > > could someone please try to update recompile and see if this works:
> > >
> > > "foobar".findRegexp("o*bar");
> > >
> > > "32424 334 /**aaaaaa*/".findRegexp("/\\*\\*a*\\*/");
> > >
> > > "foobar".findRegexp("(o*)(bar)");
> > >
> > > "aaaabaaa".findAllRegexp("a+");
> >
> > I don't see the change here yet.. Is anonymous svn lagging on SF?
> Oh i do see it. As you said you moved it to another file..

Funny btw: This mailinglist seems to remove the CC addresses. Or did i do it 
by myself? I don't think so. I have a habit of hitting "reply all" on ML 

Anyways, looking at your code i have a few comments/questions:

[1] I don't yet quite understand the linking issues that forced you to move 
the code to a seperate file. Mind to explain? Maybe we can work it out..

[2] Since the presence of libicu is not a question of "Mac OS X or no Mac OS 
X" but rather simply whether libicu is installed we should probably simply 
add a test for libicu in the SConstruct. If this test fails we could fallback 
to the libc version (this means we have to define an preprocessor macro e.g. 
HAVE_ICU which can then be used in the code to conditionally include the one 
or the other version). I'm not too familiar with scons, so i wouldn't know 
where to test for the lib and how to define the preprocessor macro. Any hints 

[3] It seems you have taken over a quirk from my libc implementation which was 
kinda forced on me by the way the libc regex matching API is designed. In the 
libc regex API you must perallocate an array for the different group matches 
and pass that on to the regexec function along with the size of this array. 

int regexec(const regex_t *preg, const char *string, size_t nmatch, regmatch_t 
pmatch[], int eflags);

This was the factor which forced me to define this MAX_NUM_OF_MATCHES macro 
which was just to have a well visible place to tune this API imposed limit 
(There would have been a way around it by shifting through the input string 
manually always using a MAX_NUM_OF_MATCHES == 1 but i was too lazy at the 
time. I was kinda awaiting a roar of outrage about my laziness). It seems to 
me the ICU API does not show such a design (i only briefly looked over it).. 
Looking at the ICU API it seems we could simply do a single while loop which 
consecutively fetches the next match, resizes the output array accordingly 
and stores the result in it..

Do you agree/disagree on this? Or do i miss something? If so, just shout at 
me ;)


Palimm Palimm!