Ticket #749 (new defect)

Opened 20 months ago

Last modified 14 months ago

Regexp issues / crashes affecting StringScanner

Reported by: kitchen.andy@… Owned by: lsansonetti@…
Priority: minor Milestone: MacRuby 1.0
Component: MacRuby Keywords:
Cc:

Description (last modified by martinlagardette@…) (diff)

StringScanner fails to match certain regular expressions properly and diverges from MRI.

[EDIT] See comments.

Attachments

test.rb Download (0.9 KB) - added by kitchen.andy@… 20 months ago.
Unit test that exhibits the problem, run with MRI for correct output.
regex1.rb Download (1.0 KB) - added by martinlagardette@… 20 months ago.
Infinite looping
regex2.rb Download (1.0 KB) - added by martinlagardette@… 20 months ago.
Segfaulting
reproductible_regexp_crash.rb Download (1.8 KB) - added by martinlagardette@… 19 months ago.

Change History

Changed 20 months ago by kitchen.andy@…

Unit test that exhibits the problem, run with MRI for correct output.

Changed 20 months ago by martinlagardette@…

hi!

Thanks for the report.

First of all, well, you know you shouldn't be using regexps to parse a file :D (especially Ruby, since you have Ripper for that).
Second, I think the line you commented out is a better regexp than the one in use, which is indeed not working correctly with MacRuby.

In the end, we do have problems with our regexps, and StringScanner suffers from them :-(. I'll keep the bug open and add more information about it.

Changed 20 months ago by martinlagardette@…

Infinite looping

Changed 20 months ago by martinlagardette@…

Segfaulting

Changed 20 months ago by martinlagardette@…

  • description modified (diff)
  • summary changed from StringScanner doesn't match properly. to Regexp issues / crashes affecting StringScanner

The infinite-loop case uses this loop:

    while ss.skip_until(/class(?![^ \n])|def(?![^ \n])/)

The segfault case is the same, with the addition of surrounding parentheses:

    while ss.skip_until(/(class(?![^ \n])|def(?![^ \n]))/)
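For reference, here is a minimal sketch of how such a loop is expected to terminate under MRI. The input string and the helper name `keyword_end_offsets` are made up for illustration; they are not taken from the attached files. `skip_until` returns nil once no further match exists, so the loop should end rather than spin, and the grouped variant should give the same positions.

```ruby
require 'strscan'

# Hypothetical input, not the content of the attached test files.
SOURCE = "class Foo\n  def bar\n  end\nend\n"

# Patterns from the ticket: `class` or `def`, not followed by any
# character other than a space or a newline.
KEYWORD = /class(?![^ \n])|def(?![^ \n])/
GROUPED = /(class(?![^ \n])|def(?![^ \n]))/

# Hypothetical helper: collects the scanner position after each match.
def keyword_end_offsets(text, pattern)
  ss = StringScanner.new(text)
  offsets = []
  # skip_until advances past the next match and returns nil when
  # nothing more matches, which is what ends the loop.
  while ss.skip_until(pattern)
    offsets << ss.pos
  end
  offsets
end

puts keyword_end_offsets(SOURCE, KEYWORD).inspect  # [5, 15] under MRI
puts keyword_end_offsets(SOURCE, GROUPED).inspect  # same positions expected
```

Running the same script under MacRuby and comparing the output with MRI should show whether the scanner still loops or crashes.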

Changed 19 months ago by lsansonetti@…

  • milestone MacRuby 0.7 deleted

Well, I tried the regex2.rb file with trunk using all possible versions of the while loop, and it doesn't crash. Has this problem been fixed? Can someone still reproduce it?

Changed 19 months ago by martinlagardette@…

Yep, regexp1 and regexp2 both still crash:

[...]

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0x0000007300000000
[Switching to process 84323]
0x00007fff86d4a13e in objc_msgSend ()


(gdb) thread apply all bt

Thread 3 (process 84323):
#0  0x00007fff86d4a13e in objc_msgSend ()
#1  0x00007fff86d50f67 in finalizeOneObject ()
#2  0x00007fff852653d5 in Auto::foreach_block_do ()
#3  0x00007fff86d50b5c in batchFinalize ()
#4  0x00007fff8525f1e7 in Auto::Zone::invalidate_garbage ()
#5  0x00007fff8524fbd1 in auto_collect_internal ()
#6  0x00007fff8525016d in auto_collection_work ()
#7  0x00007fff801611b0 in _dispatch_call_block_and_release ()
#8  0x00007fff8013fd52 in _dispatch_queue_drain ()
#9  0x00007fff8013fbb4 in _dispatch_queue_invoke ()
#10 0x00007fff8013f75e in _dispatch_worker_thread2 ()
#11 0x00007fff8013f088 in _pthread_wqthread ()
#12 0x00007fff8013ef25 in start_wqthread ()

Thread 2 (process 84323):
#0  0x00007fff8013e08a in kevent ()
#1  0x00007fff8013ff5d in _dispatch_mgr_invoke ()
#2  0x00007fff8013fc34 in _dispatch_queue_invoke ()
#3  0x00007fff8013f75e in _dispatch_worker_thread2 ()
#4  0x00007fff8013f088 in _pthread_wqthread ()
#5  0x00007fff8013ef25 in start_wqthread ()

Thread 1 (process 84323):
#0  0x00007fff85265941 in Auto::Zone::block_start ()
#1  0x00007fff85252d28 in auto_zone_set_write_barrier ()
#2  0x00000001000b365f in str_replace_with_bytes (self=0x2003c9bc0, bytes=0x1035da792 ", ", len=2, enc=<value temporarily unavailable, due to optimizations>) at string.c:239
#3  0x00000001000c3816 in rb_str_new (cstr=0x1035da792 ", ", len=2) at string.c:6041
#4  0x0000000103595a18 in ?? ()
#5  0x0000000100131016 in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e55640, top=0, self=8590893792, klass=0x200250260, sel=0x101166180, block=0x0, opt=2 '\002', argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbfa3e8) at dispatcher.cpp:159
#6  0x00000001000eff70 in rb_f_send (recv=8590893792, sel=<value temporarily unavailable, due to optimizations>, argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbfa3e0) at vm.h:594
#7  0x00000001001310b9 in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e5c240, top=8590893792, self=8590893792, klass=0x200250260, sel=0x100f60440, block=0x0, opt=0 '\0', argc=1, argv=0x7fff5fbfa3e0) at dispatcher.cpp:435
#8  0x000000010355a820 in ?? ()
#9  0x00000001035947b5 in ?? ()
#10 0x0000000100130ffb in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e53ed0, top=8592104480, self=8590893792, klass=0x200250260, sel=0x7fff850fb103, block=0x0, opt=0 '\0', argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbfb170) at dispatcher.cpp:161
#11 0x000000010355a820 in ?? ()
#12 0x0000000103593ec7 in ?? ()
#13 0x0000000100133bc7 in rb_vm_yield_args (_vm=0x100f1e4a0, argc=<value temporarily unavailable, due to optimizations>, argv=0x200250260) at dispatcher.cpp:100
#14 0x00000001000edc58 in rb_yield (val=8592794976) at vm_eval.c:196
#15 0x0000000100004fdd in rary_each (ary=8590895104, sel=<value temporarily unavailable, due to optimizations>) at array.c:1064
#16 0x0000000100131016 in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e58280, top=8592104480, self=8590895104, klass=0x20006d540, sel=0x100f51810, block=0x2000e5b60, opt=0 '\0', argc=<value temporarily unavailable, due to optimizations>, argv=0x0) at dispatcher.cpp:159
#17 0x000000010355a820 in ?? ()
#18 0x00000001035933bf in ?? ()
#19 0x0000000100133bc7 in rb_vm_yield_args (_vm=0x100f1e4a0, argc=<value temporarily unavailable, due to optimizations>, argv=0x20006d540) at dispatcher.cpp:100
#20 0x00000001000edc58 in rb_yield (val=8592360032) at vm_eval.c:196
#21 0x0000000100004fdd in rary_each (ary=8590852992, sel=<value temporarily unavailable, due to optimizations>) at array.c:1064
#22 0x0000000100131016 in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e58280, top=8592104480, self=8590852992, klass=0x20006d540, sel=0x100f51810, block=0x2000f4500, opt=0 '\0', argc=<value temporarily unavailable, due to optimizations>, argv=0x0) at dispatcher.cpp:159
#23 0x000000010355a820 in ?? ()
#24 0x0000000103592ba7 in ?? ()
#25 0x0000000100130ffb in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e5bd00, top=8592104480, self=8592104480, klass=0x200279c80, sel=0x10530d060, block=0x0, opt=2 '\002', argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbfe328) at dispatcher.cpp:161
#26 0x000000010355a820 in ?? ()
#27 0x0000000103587fc5 in ?? ()
#28 0x0000000100130ffb in rb_vm_dispatch (_vm=0x100f1e4a0, cache=0x100e57b90, top=8592530560, self=8592104480, klass=0x200279c80, sel=0x7fff850fb103, block=0x0, opt=0 '\0', argc=<value temporarily unavailable, due to optimizations>, argv=0x7fff5fbff090) at dispatcher.cpp:161
#29 0x000000010355a820 in ?? ()
#30 0x00000001035879e1 in ?? ()
#31 0x00000001001329a9 in rb_vm_block_eval (b=0x100f1e4a0, argc=<value temporarily unavailable, due to optimizations>, argv=0x1) at dispatcher.cpp:98
#32 0x00000001001419ca in rb_rescue2 (b_proc=<value temporarily unavailable, due to optimizations>, data1=<value temporarily unavailable, due to optimizations>, r_proc=0x10002f8d0 <rb_end_proc_call_catch>, data2=0) at vm.cpp:3361
#33 0x000000010002fea0 in rb_exec_end_proc [inlined] () at /Users/naixn/Documents/Projets/MacRuby/eval_jump.c:483
#34 ruby_finalize_0 [inlined] () at /Users/naixn/Documents/Projets/MacRuby/eval.c:83
#35 0x000000010002fea0 in ruby_finalize () at eval_jump.c:97
#36 0x000000010008fd60 in rb_exit (status=0) at process.c:2473
#37 0x0000000100000cff in main (argc=2, argv=0x100f1de00, envp=<value temporarily unavailable, due to optimizations>) at main.cpp:40
(gdb) 

When reading the console, I sometimes get this:

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libicucore.A.dylib            	0x00007fff88fbd048 icu::RegexMatcher::~RegexMatcher() + 106
1   libmacruby.dylib              	0x00000001000a7e43 rb_unicode_regex_new_retained + 2243
2   libmacruby.dylib              	0x00000001000a7eee rb_unicode_regex_new_retained + 2414
3   libmacruby.dylib              	0x00000001001310b9 rb_vm_dispatch + 6793
4   ???                           	0x0000000102f00820 0 + 4344252448
5   ???                           	0x0000000102f2303e 0 + 4344393790

This trace does not always appear, and the script does not always crash. When it doesn't crash, it goes into an infinite loop instead, which isn't expected either (just like regexp1).

Changed 19 months ago by lsansonetti@…

Looks like it crashes during finalization.

Could you reduce the crash to one single file and attach it here?

Changed 19 months ago by martinlagardette@…

Changed 19 months ago by lsansonetti@…

OK, I can now reproduce this every time on 10.6.4.

Changed 14 months ago by mattaimonetti@…

I can still reproduce the problem with the following stack trace:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0x0000007300000018
Crashed Thread:  0  Dispatch queue: com.apple.main-thread

Application Specific Information:
objc[47662]: garbage collection is ON

Thread 0 Crashed:  Dispatch queue: com.apple.main-thread
0   libicucore.A.dylib            	0x00007fff89085048 icu::RegexMatcher::~RegexMatcher() + 106
1   libicucore.A.dylib            	0x00007fff890bb7e3 uregex_find + 136
2   libicucore.A.dylib            	0x00007fff890bc4a7 uregex_close + 35
3   libmacruby.dylib              	0x00000001000b9631 rb_unicode_regex_new_retained + 673
4   libmacruby.dylib              	0x00000001000b976e rb_unicode_regex_new_retained + 990
5   libmacruby.dylib              	0x000000010014a6f5 rb_vm_dispatch + 6981
6   ???                           	0x0000000100ef7bbc 0 + 4310662076
7   ???                           	0x0000000100ef8f5b 0 + 4310667099
8   libmacruby.dylib              	0x000000010014acad rb_vm_dispatch + 8445
9   ???                           	0x0000000100ef7bbc 0 + 4310662076
10  ???                           	0x0000000100ef8a46 0 + 4310665798
11  libmacruby.dylib              	0x000000010014ad2b rb_vm_dispatch + 8571
12  ???                           	0x0000000100ef7bbc 0 + 4310662076
13  ???                           	0x0000000100ef73b1 0 + 4310660017
14  libmacruby.dylib              	0x0000000100161ca3 rb_vm_run + 531
15  libmacruby.dylib              	0x00000001000406f0 ruby_run_node + 80
16  macruby                       	0x0000000100000d28 main + 152
17  macruby                       	0x0000000100000c88 start + 52

Thread 1:
0   libSystem.B.dylib             	0x00007fff8355ef8a __workq_kernreturn + 10
1   libSystem.B.dylib             	0x00007fff8355f39c _pthread_wqthread + 917
2   libSystem.B.dylib             	0x00007fff8355f005 start_wqthread + 13

Thread 0 crashed with X86 Thread State (64-bit):
  rax: 0x0000000102e99908  rbx: 0x0000000102e99890  rcx: 0x00007fff70435630  rdx: 0x00000000001f8100
  rdi: 0x0000007300000018  rsi: 0x0000000102e00000  rbp: 0x00007fff5fbfc210  rsp: 0x00007fff5fbfc200
   r8: 0x0000000000000001   r9: 0x0000000102e99960  r10: 0x0000000102e96920  r11: 0x0000000000000008
  r12: 0x0000000000000073  r13: 0x0000000000000000  r14: 0x0000000000000000  r15: 0x000000020005f600
  rip: 0x00007fff89085048  rfl: 0x0000000000010206  cr2: 0x0000007300000018

Changed 14 months ago by lsansonetti@…

I wonder if it will also crash with ICU 4.2.

Changed 14 months ago by lsansonetti@…

  • milestone set to MacRuby 1.0

According to watson, it doesn't crash anymore with ICU 4.4.2. Like #1016, we might need to use a newer ICU.
