IRC logs for #openrisc Thursday, 2014-07-10

--- Log opened Thu Jul 10 00:00:12 2014
02:18 <stekern> dalias: I wish I knew that ;) but this is what I'm observing: after the call to pthread_create(), I can see the parent arriving here: https://github.com/skristiansson/musl-or1k/blob/master/src/thread/or1k/clone.s#L19
02:19 <stekern> but then it seems the child never gets scheduled or something, since it never arrives at that same place
03:24 <dalias> stekern, how do you observe this?
03:24 <dalias> this isn't the first thread test, is it?
03:25 <dalias> so clone has already worked at least once...
03:42 <stekern> dalias: we have a feature in or1ksim that can dump the entire instruction trace, I turned that on at entry to main() in pthread_cancel.c and then I grep for the address of that instruction in clone.s
03:42 <stekern> and yes, the first test worked
03:43 <stekern> and I also tested commenting the first test out, just to sanity check that the second test wouldn't work when run as the first test
03:46 <stekern> it doesn't
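
The trace search described above can be sketched roughly as follows. This is only an illustration: the trace line layout (a hex program counter as the first whitespace-separated field), the sample addresses, and the name `count_pc_hits` are all assumptions, not or1ksim's actual output format.

```python
def count_pc_hits(trace_lines, target_pc):
    """Count trace lines whose first field is the given program counter.

    Assumed (hypothetical) layout: each line starts with a hex address,
    e.g. "00002f40 l.sfeqi r11,0". Lines that do not fit are skipped.
    """
    hits = 0
    for line in trace_lines:
        fields = line.split()
        if not fields:
            continue
        try:
            pc = int(fields[0], 16)
        except ValueError:
            continue  # first field is not a hex address; skip the line
        if pc == target_pc:
            hits += 1
    return hits


# Made-up trace excerpt: the instruction after the syscall is reached once.
sample = [
    "00002f3c l.sys 1",
    "00002f40 l.sfeqi r11,0",
    "00002f44 l.bf 0x2f60",
]
print(count_pc_hits(sample, 0x2F40))
```

If both parent and child returned from clone, the instruction after the syscall would be expected to show up twice in such a trace; a single hit matches the observation that only the parent arrives there.
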
03:48 <dalias> the syscall failing to return makes no sense...
03:48 <stekern> I agree
03:49 <stekern> it's of course entirely possible that this is caused by some latent kernel bug
03:49 <stekern> since obviously the circumstances have to be right for it to happen (since the first clone works)
03:50 <dalias> well if you can dump the entire instruction trace...
03:50 <dalias> what happens to the new thread in kernelspace?
03:50 <dalias> does it ever get scheduled?
03:51 <stekern> yes, the answer is probably in that trace, it's just a matter of digging through it ;)
03:51 <dalias> yes
03:52 <dalias> digging thru traces is no fun :/
03:52 <stekern> I don't know if it ever gets scheduled, that's what I had planned on trying to find out next
03:55 <stekern> but, just looking at what the circumstances might be, the difference between the first and the second test is that the first waits for the child thread to start before calling pthread_cancel
04:02 <stekern> I've managed to compile strace against musl now too, that'll help debugging slightly
04:04 <stekern> this patch helped a lot in getting it to build: http://git.opensde.net/opensde/package-nopast/tree/base/musl/pkg/strace/strace-musl.patch
04:05 <stekern> I had done half of what's in it before I found it though :/
04:25 <dalias> :-p
04:28 <dalias> i had forgotten opensde but yeah there are several places you should check for patches for stuff like that before spending time redoing it
04:29 <dalias> sabotage, alpine, ...
04:37 <stekern> ok, will do that in the future ;)
05:10 <dalias> going to sleep now. catch you later. let me know if you make any sense of this :-p
05:16 <stekern> dalias: sleep well, I'll let you know when I find something of interest
06:22 <stekern> olofk_: how do you convert that?
06:22 <stekern> I have opened it in quartus
06:25 <stekern> besides, I can't find anything but high level information in that?
06:25 <stekern> or is that just some interconnect thing?
14:54 <maxpaln> FYI, I have traced this to a bug in the memory controller - either my logic, the DDR3 IP or the memory itself. At the point of the fault being created, I see a write to memory of 0x0 then a read from the same address of 0x8701FFEC.
14:55 <maxpaln> thanks for your help in getting me to this point
15:18 -!- guilherme is now known as Guest71251
15:35 <stekern> maxpaln: great that your headache paid off in the end!
16:33 <maxpaln> yeah, although now that I've found the bug, fixing it might be tricky. If I'm lucky it will be someone else's :-)
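
The write-then-read symptom above is the kind of thing a read-back check over a bus trace would flag. A minimal sketch, assuming a hypothetical trace of `(op, address, data)` tuples; the function name and the address 0x1000 are invented for illustration, and only the two data values come from the log.

```python
def find_readback_mismatches(transactions):
    """Return (addr, expected, got) for reads that differ from the last write."""
    last_written = {}
    mismatches = []
    for op, addr, data in transactions:
        if op == "write":
            last_written[addr] = data
        elif op == "read" and addr in last_written and data != last_written[addr]:
            mismatches.append((addr, last_written[addr], data))
    return mismatches


# The address is made up; the data values are the ones quoted above.
log = [("write", 0x1000, 0x0), ("read", 0x1000, 0x8701FFEC)]
for addr, expected, got in find_readback_mismatches(log):
    print(f"{addr:#x}: wrote {expected:#010x}, read {got:#010x}")
```

A check like this only says that the stored value was lost or corrupted somewhere between the write and the read; it cannot tell the controller logic, the DDR3 IP, and the memory itself apart.
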
20:34 <stekern> dalias: making baby step progress on the bug, if nothing else I'm starting to get fairly familiar with parts of the code in src/thread.
20:34 <stekern> the bug itself is the definition of a heisenbug, it disappears, or partly changes nature, if I add debug printfs
20:35 <stekern> anyway, I have a direct question you might be able to answer
20:37 <stekern> this call to __timedwait() will cause a FUTEX_WAIT with the timeout argument as 0 (wait indefinitely)
20:38 <stekern> where do I find the FUTEX_WAKE for that?
20:38 <stekern> 'this call' = http://git.musl-libc.org/cgit/musl/tree/src/thread/pthread_join.c#n11
21:02 <dalias> src/internal/futex.h i think
21:02 <dalias> btw is it possible that some of the args to clone are still being passed wrong?
21:05 <dalias> if the child_tid_ptr arg is wrong, that would of course prevent pthread_join from working
21:05 <stekern> anything is of course possible, but I've stared at those enough to believe that they should be correct now
21:06 <dalias> that's what makes the kernel futex_wake the tid address atomically with respect to the thread exiting
21:08 <stekern> yes
21:11 <stekern> this is what a strace'd clone syscall looks like: clone(child_stack=0x300e9d18, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|0x400000, parent_tidptr=0x300e9d3c, tls=0x300e9de4, child_tidptr=0x300e9d3c) = 52
21:13 <dalias> yeah that looks right
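
For reference, the flag word in the strace line above can be decoded by hand. The sketch below uses the bit values from the Linux `<linux/sched.h>` header; the helper name `decode_clone_flags` is made up for illustration. The raw `0x400000` that strace left unnamed corresponds to `CLONE_DETACHED` in that header, which the kernel header marks as unused and ignored.

```python
# Bit values as defined in the Linux UAPI header <linux/sched.h>.
CLONE_FLAGS = {
    0x00000100: "CLONE_VM",
    0x00000200: "CLONE_FS",
    0x00000400: "CLONE_FILES",
    0x00000800: "CLONE_SIGHAND",
    0x00010000: "CLONE_THREAD",
    0x00040000: "CLONE_SYSVSEM",
    0x00080000: "CLONE_SETTLS",
    0x00100000: "CLONE_PARENT_SETTID",
    0x00200000: "CLONE_CHILD_CLEARTID",
    0x00400000: "CLONE_DETACHED",
}


def decode_clone_flags(flags):
    """Return the names of the known bits set in flags, lowest bit first.

    Any bits not in the table (including the exit-signal low byte) are
    appended as one hex string, the way strace printed 0x400000.
    """
    names = []
    known = 0
    for bit, name in sorted(CLONE_FLAGS.items()):
        known |= bit
        if flags & bit:
            names.append(name)
    leftover = flags & ~known
    if leftover:
        names.append(hex(leftover))
    return names


# 0x7d0f00 is the OR of every flag shown in the strace output above.
print(decode_clone_flags(0x7D0F00))
```
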
21:31 <stekern> ah, here's where the FUTEX_WAKE comes from: http://lxr.free-electrons.com/source/kernel/fork.c#L794
21:41 <dalias> yes
21:41 <dalias> sorry i misread your question :(
21:41 <dalias> <stekern> where do I find the FUTEX_WAKE for that?
21:41 <dalias> i misread that as FUTEX_WAIT and assumed you were just asking where the constant was defined
21:41 <dalias> i could have told you it was in kernel/fork.c right away :)
21:42 <stekern> heh, I'm not completely incompetent with grep ;)
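
The wait/wake pairing discussed above can be modelled in a few lines. This is a toy model only, not musl's or the kernel's code: a `threading.Condition` stands in for the futex, and the names `TidWord`, `futex_wait`, `clear_and_wake` and `join` are invented for the sketch. The joiner blocks while the tid word is non-zero, and the exit path clears the word and wakes waiters, which is the role CLONE_CHILD_CLEARTID plays for the address passed as child_tidptr.

```python
import threading


class TidWord:
    """Toy stand-in for the thread's tid word plus its futex wait queue."""

    def __init__(self, tid):
        self.value = tid
        self._cond = threading.Condition()

    def futex_wait(self, expected):
        # Like FUTEX_WAIT: block only if the word still holds `expected`.
        with self._cond:
            if self.value == expected:
                self._cond.wait()

    def clear_and_wake(self):
        # Like the exit path under CLONE_CHILD_CLEARTID:
        # store 0 to the tid word, then wake anyone waiting on it.
        with self._cond:
            self.value = 0
            self._cond.notify_all()


def join(word):
    # Re-check after every wakeup, since a wait can return early.
    while True:
        tid = word.value
        if tid == 0:
            return
        word.futex_wait(tid)


word = TidWord(52)  # 52 is the tid returned in the strace line
joiner = threading.Thread(target=join, args=(word,))
joiner.start()
word.clear_and_wake()  # the "child" exits
joiner.join(timeout=5)
print("joiner still blocked:", joiner.is_alive())
```

In this model, if `clear_and_wake` is never called on the word the joiner is watching (for example because a wrong address was registered), `join` blocks forever, which is why a bad child_tidptr would break pthread_join.
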
22:15 <blueCmd_> stekern: jeremybennett_: https://github.com/bluecmd/or1ksim/commit/8ccb1f1677402e9103322ecba60d0370cea2bded.patch for loopback support for or1ksim
--- Log closed Fri Jul 11 00:00:14 2014
