IRC logs for #openrisc Wednesday, 2014-07-16

--- Log opened Wed Jul 16 00:00:21 2014
daliasstekern, malloc-brk-fail is known to fail for static02:31
stekerndalias: oh, actually, it's only the static case that fails02:38
stekernwhat's the reason for that?02:38
daliasit's a limitation in __simple_malloc that gets used because there's no realloc or free02:57
stekernah03:02
stekerndid you see the note about microblaze bits/stat.h btw?03:03
daliaswhen we made malloc work without brk, i forgot to do __simple_malloc too. but it's unlikely that static programs would have a brk issue anyway03:03
daliasi might fix it later anyway tho03:03
daliasyes i saw that03:03
daliasbut i think microblaze is working03:03
daliasare you sure it uses asm-generic?03:03
daliasi'll check it...03:04
stekernif this doesn't mean it does, then I'm confused: http://lxr.free-electrons.com/source/arch/microblaze/include/uapi/asm/stat.h03:05
daliashm03:06
daliasthe test for microblaze passes...03:22
daliasahh i suspect the test does not catch the breakage03:23
stekernwhich test? it was sem_open that broke for me03:24
daliassrc/functional/stat.c03:24
stekernall offsets except st_ino are correct03:24
daliashmm it looks like sem_open probably succeeds or fails at random depending on junk with the wrong st_ino offset03:26
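A minimal sketch of why a wrong st_ino offset gives "random" results, as described above. The two struct layouts are simplified, hypothetical stand-ins (not the real kernel or musl definitions): one side fills the structure with the 64-bit inode near the start, the other reads it from a field at the end, so the reader sees whatever bytes happen to be there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout used by the side that fills the buffer. */
struct filler_layout {
    uint64_t st_dev;
    uint64_t st_ino;    /* full 64-bit inode near the start */
    uint32_t st_mode;
    uint32_t st_nlink;
    uint64_t unused;    /* never written by the filler */
};

/* Hypothetical layout used by the side that reads the buffer. */
struct reader_layout {
    uint64_t st_dev;
    uint32_t pad;
    uint32_t short_ino; /* 32-bit field overlapping half of the real inode */
    uint32_t st_mode;
    uint32_t st_nlink;
    uint64_t st_ino;    /* reader expects the inode here */
};

static_assert(sizeof(struct filler_layout) == sizeof(struct reader_layout),
              "sketch assumes both layouts have the same size");

/* Returns the inode the reader sees when the buffer initially holds
 * `junk` in every byte and the filler then stores `real_ino`. */
uint64_t ino_seen_by_reader(uint64_t real_ino, unsigned char junk)
{
    struct filler_layout k;
    struct reader_layout u;

    memset(&k, junk, sizeof k);   /* stale buffer contents */
    k.st_dev = 1;
    k.st_ino = real_ino;
    k.st_mode = 0100644;
    k.st_nlink = 1;

    memcpy(&u, &k, sizeof u);     /* reinterpret with the other layout */
    return u.st_ino;              /* reads the never-written tail */
}
```

In this sketch st_mode and st_nlink line up in both layouts, so only the inode is wrong, and whether a caller notices depends entirely on the stale bytes in the buffer.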
daliasi can't get it to fail on microblaze with qemu user here03:26
stekerninteresting, I guess I was lucky that it failed for me then ;)03:30
dalias:)03:31
stekernwhat host are you on?03:31
dalias?03:35
stekernnah, that can't be it... I was thinking the stat conversion in qemu-user would make the data always correct03:35
stekernbut I looked up how it's done, and it clears the target stat struct first03:36
stekern...so there shouldn't even be junk there...03:36
stekernah, actually... I was looking at the wrong place, microblaze qemu-user copies st_ino to both offsets...03:42
daliasoh? haha03:43
daliaswhy?03:43
stekernhttp://git.qemu.org/?p=qemu.git;a=blob;f=linux-user/syscall.c;h=a50229d0d72fc68966515fcf2bc308b833a3c032;hb=HEAD#l494903:46
stekernhttp://git.qemu.org/?p=qemu.git;a=blob;f=linux-user/syscall_defs.h;h=c9e6323905486452f518102bf40ba73143c9d601;hb=HEAD#l146903:46
stekernno idea03:47
dalias.....03:48
daliasqemu has it wrong03:48
daliasthey truncate the earlier one to 32 bits :/03:48
daliasdespite it apparently being the correct one03:48
daliasthis looks really bad03:50
stekernyeah, wonder where that even came from?03:51
daliasno idea03:55
daliasemailing several lists about it03:57
daliasshall i cc you?03:57
stekernout of curiosity, I looked in the kernel's git log, and the kernel stat64 has never looked like that03:58
stekernsure, it'd be interesting to hear if there is any explanation for it03:58
daliaswhat address?03:58
stekernstefan.kristiansson@saunalahti.fi03:59
daliasi think we need to add an arch-specific __stat_fixup function...04:00
daliasif nothing else, mips seems to still be broken04:00
daliasmips idiotically has 32-bit dev_t still04:00
daliasand there's padding for plenty more04:00
daliasbut the padding is situated such that when we define it as 64-bit in userspace, the lo/hi halves are backwards on big endian04:01
daliasmaybe there'll be a way to work around this nasty microblaze qemu/kernel mismatch too with such a function04:02
daliastho i doubt it04:02
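The big-endian dev_t problem mentioned above can be sketched as follows. This assumes (it is not confirmed by the log) that the kernel stores a 32-bit device number followed by 32 bits of zero padding; the function names are hypothetical and this is not musl's actual fixup. Bytes are decoded explicitly as big-endian, so the sketch behaves the same on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed kernel-side layout: 32-bit dev in bytes 0..3 (big-endian),
 * followed by four bytes of zero padding. */
void store_kernel_dev_be(unsigned char out[8], uint32_t dev)
{
    out[0] = (unsigned char)(dev >> 24);
    out[1] = (unsigned char)(dev >> 16);
    out[2] = (unsigned char)(dev >> 8);
    out[3] = (unsigned char)dev;
    out[4] = out[5] = out[6] = out[7] = 0;
}

/* Decode 8 bytes as one big-endian 64-bit value, which is what a
 * big-endian userspace sees if it declares the field as 64-bit. */
uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;

    assert(p != 0);
    for (int i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical fixup: move the device number back to the low half. */
uint64_t fixup_dev_sketch(uint64_t raw)
{
    return raw >> 32;
}
```

Under these assumptions a device number such as 0x0801 is read back as 0x0801 shifted left by 32 bits until the fixup is applied; on a little-endian host the same byte layout would already read correctly.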
daliasstekern, i think the broken files on our side came from arm04:08
daliaswhich is where many/most ports were initially forked from04:08
dalias(for arm, this stat.h is right)04:09
stekernah, right. the initial commit of stat.h for microblaze is identical to the arm one04:11
stekernthe math errors are because I don't have a fenv implementation04:18
stekernmicroblaze qemu-user was actually correct prior to this 'fix'... http://git.qemu.org/?p=qemu.git;a=commitdiff;h=a523eb06ec3fb2f4f4f4d362bb23704811d1137905:01
maxpalnLife is so much easier with console output :-)09:10
maxpalnis there a safe way to write to the console during the boot sequence - printk() seems to cause exceptions depending on where it sits.09:10
stekernmaxpaln: printk *should* be safe in most places10:14
chan1hello, someone please help!12:24
chan1I was following http://openrisc.net/toolchain-build.html.12:24
chan1built the toolchain easy way,12:24
chan1then, am I supposed to go directly to building Busybox? (skipping install linux headers, stage 2 gcc, and uClibc)12:24
chan1That's what I did and I have an error building busybox.12:24
stekernchan1: http://juliusbaxter.net/openrisc-irc/%23openrisc.2014-07-07.log.html#t13:2212:27
maxpalnstekern: Taken to using pr_info() - it seems safer, although I have to be honest, I am using it blindly - it was the way I output to the UART when debugging the Ethernet PHY drivers several months ago.12:27
stekernmaxpaln: pr_info() is just a wrapper to printk12:28
stekernhttp://lxr.free-electrons.com/source/include/linux/printk.h#L22612:28
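A userspace sketch of the wrapper relationship described above. The macro mirrors the shape of the kernel's pr_info() (minus pr_fmt()), but the printk() here is only a stand-in that formats into a buffer, and last_level() is a hypothetical helper. The `##__VA_ARGS__` form is a GNU extension supported by gcc and clang.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define KERN_SOH  "\001"          /* start-of-header marker */
#define KERN_INFO KERN_SOH "6"    /* log level 6 = informational */

static char last_record[256];

/* Stand-in for printk(): formats the record into a buffer. */
int printk(const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(last_record, sizeof last_record, fmt, ap);
    va_end(ap);
    return n;
}

/* pr_info() only prepends the level prefix and calls printk(). */
#define pr_info(fmt, ...) printk(KERN_INFO fmt, ##__VA_ARGS__)

/* Level digit of the most recent record, or -1 if it has no prefix. */
int last_level(void)
{
    return (last_record[0] == '\001') ? last_record[1] - '0' : -1;
}
```

Since the macro expands to a plain printk() call, both forms go through the same code path; the only difference in this sketch is the level prefix at the start of the format string.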
maxpalnOh, odd - it seems to cause fewer exceptions! Oh well....12:28
maxpalnout of interest, I get periodic I-TLB Miss exceptions - I am not paying particular attention to these at the moment as they get handled safely. But are they indicative of deeper issues?12:29
stekernmaxpaln: I would guess it's just old mr Heisenbug that is visiting you ;)12:29
maxpaln:-)12:29
stekernTLB-miss exceptions are perfectly normal, and completely expected12:30
maxpalnexcellent - finally a break! :-)12:30
stekernthe TLB is just a pagetable cache, and the TLB-miss happens when the cache doesn't contain the pagetable entry for the requested address12:30
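The "pagetable cache" behaviour described above can be sketched with a small direct-mapped software model. The sizes, the flat page table, and all function names are assumptions for illustration only, not the OpenRISC MMU or the kernel's miss handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 13u    /* assumed 8 KiB pages */
#define TLB_SETS   64u    /* assumed number of TLB entries */
#define NUM_PAGES  256u   /* size of the toy page table */

struct tlb_entry { bool valid; uint32_t vpn; uint32_t pfn; };

static struct tlb_entry tlb[TLB_SETS];
static uint32_t page_table[NUM_PAGES];
static unsigned tlb_misses;

/* Install a virtual-page to physical-frame mapping in the page table. */
void map_page(uint32_t vpn, uint32_t pfn)
{
    assert(vpn < NUM_PAGES);
    page_table[vpn] = pfn;
}

unsigned tlb_miss_count(void)
{
    return tlb_misses;
}

/* Translate an address: use the TLB entry if it matches, otherwise
 * count a miss and refill the entry from the page table. */
uint32_t translate(uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1u);
    struct tlb_entry *e;

    assert(vpn < NUM_PAGES);
    e = &tlb[vpn % TLB_SETS];
    if (!e->valid || e->vpn != vpn) {
        tlb_misses++;                 /* the "TLB miss exception" */
        e->valid = true;
        e->vpn = vpn;
        e->pfn = page_table[vpn];     /* refill from the page table */
    }
    return (e->pfn << PAGE_SHIFT) | offset;
}
```

In this model the first access to a page always misses, and two pages that share a TLB slot evict each other, so a steady trickle of misses is expected even when nothing is wrong.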
chan1stekern:oh thank you :-)12:35
stekernyou're welcome12:37
maxpalnstekern: that's what I figured - but nice to have an assumption confirmed every now and again :-)12:45
chan1stekern : sorry but the link for the precompiled binary 1.0rc1 for CentOS-5.5 x86_64 doesn't work. I'll try installing from source. (someone please check why the link is dead..)12:55
chan1I mean in this page    http://opencores.org/or1k/OpenRISC_GNU_tool_chain12:55
stekernchan1: I was pointing you to: http://opencores.org/or1k/OpenRISC_GNU_tool_chain#Linux_.28uClibc.29_toolchain_.28or1k-linux-uclibc.2913:16
stekernnot the old precompiled toolchain13:16
maxpalnI am getting a little stuck - Linux is pausing during boot after the 'Mount-cache hash table entries: 1024' message13:26
maxpalntracing through I can see it gets stuck during the initialisation of something (not sure what the actual construct is) -  the function call stack looks like this:13:28
maxpalnstart_kernel->rest_init->kernel_init->kernel_init_freeable->do_pre_smp_initcalls->do_one_initcall13:31
maxpalnIt enters this function twice - the first time to execute spawn_ksoftirqd() from address c01c17e0 - this one completes fine13:32
maxpalnthe second time it executes init_workqueues() from c01c1f3813:33
maxpalntracing a combination of debug printk's and watching the instructions in HW I have traced the code as far as init_worker_pool() - unfortunately adding printk's into this function seems to hang the Linux boot earlier so I am back to tracing through HW13:34
maxpalnDoes anyone have a bird's-eye view on this - what is the kernel actually doing at this stage?13:35
maxpalnIt would be useful to have a broader appreciation as it might point me at the root cause a little quicker than my current strategy13:35
stekernmaxpaln, presumably there is still some hw bug that causes this, right?13:49
stekernand you have random crashes when you add debug printks13:50
stekernit's not fun, but the way I'd move forward in such cases is to take a build where it crashes in a non-subtle way and start inspecting that from the hardware side13:51
maxpalnah, so you think the crash from printk is causing this - interesting13:52
maxpalnor at least - the boot hang with printk is indicative of a HW bug13:53
maxpalninteresting, hadn't thought of that13:53
maxpalnI agree the problem is likely to be HW13:53
maxpalnbut it could also be that I haven't correctly configured the Linux kernel per the HW13:53
maxpalnI am pretty happy the memory controller is doing the right thing now13:54
maxpalnand the basic ORPSOC is the same as the one I have previously had working on the ECP3 silicon (predecessor to the current silicon I am using)13:54
maxpalnbut there have been numerous minor changes13:55
maxpalnI can print to the UART during boot - I am using that a lot at the moment13:55
maxpalnI am getting hangs when placing the printk's at certain points -13:56
maxpalnInspecting on the HW side is straightforward - I am tracing through the instructions in HW, comparing against the disassembled kernel and using printks as a guide. It's working so far13:56
stekernwhat are the certain points?14:02
stekernand, does the same kernel work in or1ksim?14:03
maxpalnyep, it simulates in or1ksim14:20
maxpalnI haven't really been paying attention to the points at which the printk causes the boot to hang14:20
maxpalnI hadn't really made the connection when it last happened14:20
maxpalnI just assumed there was a genuine reason why printk wouldn't work during some functions14:21
maxpalnI will try and add one to init_workqueues() - I have traced the code as far as here, I think this caused the problem last time I tried it.14:22
maxpalnhmmm, that seems to work14:30
maxpalni'll return to using the printk's and take note of the location that causes problems when it arises again14:31
maxpalnok, I need to pop out for an hour or so - but I have tracked the system hang to init_workqueues - the following call never gets returned:14:38
maxpalnsystem_unbound_wq = alloc_workqueue("events_unbound", WQ_UNBOUND,14:38
maxpaln    WQ_UNBOUND_MAX_ACTIVE);14:38
maxpalnI am not sure I understand what this code is doing - will look a little later when I get back.14:38
rahhttp://hardware.slashdot.org/story/14/07/16/1218238/sricambridge-opens-cheri-secure-processor-design14:42
olofkrah: I think some of the guys from that project were at last year's orconf15:54
rahis that good because these people with money are paying attention to openrisc, or bad because they went on and developed their own core anyway? :-)16:07
stekerndalias: from my point of view the or1k port is pretty much in shape to be merged, how do you want me to move forward with it? Should I post a patch to the musl ml?17:44
stekernI've already squashed the commits together and added a quite descriptive commit message on them already: https://github.com/skristiansson/musl-or1k/commit/a937ef3a8e4dac07fbd4e7e7777aaa0552780dc017:45
daliasi'll take a look17:49
daliasyou can go ahead and post to the mailing list tho if you like17:49
stekerngreat17:49
stekernsure17:49
blueCmd_olofk: stekern: http://bluecmd.github.io/19:44
blueCmd_I'm playing with jekyll and github pages. the way it works is that you have a git repository with static files that is then used to generate a static webpage that github (or another provider) serves19:45
blueCmd_pros: it's git, so we can accept pull requests and stuff like that. cons: harder to edit19:45
blueCmd_pro: it's not opencores.org and it's not a wiki19:51
blueCmd_hm, who has all this money? http://opencores.org/donation19:55
stekernhttp://opencores.org/donation,faq19:56
stekern"then the money will used to upgrade the server hardware and Opencores"19:56
stekernhaven't you noticed the vast improvements?19:57
blueCmd_stekern: ah. I guess the downtime is due to all the upgrades they are making19:57
stekernmust be19:57
blueCmd_stekern: do you know if someone owns OpenCores as a trademark?20:00
blueCmd_it says on the website that it's a registered trademark, but I don't believe that20:01
stekernI have no idea20:06
daliasstekern, does or1k have fpu and fenv that will eventually be supported? or is it all soft-float?21:26
stekerndalias: it does, but not all implementations have support for it21:29
daliasi see21:29
daliasso are there separate hard and soft float abis?21:29
daliasor is the calling convention the same either way?21:29
stekernyes, same calling convention21:30
daliasnice21:30
stekernthere are no separate fpu regs and so on21:30
daliasvery very nice21:30
daliasso we don't need two separate abi variant subarchs for that21:30
stekernright21:30
daliasbtw what about endianness? just one or both?21:31
stekernwell, in theory, the architecture is bi-endian. but in practice, there are no little-endian implementations21:32
daliasi see21:33
stekernand while there is *some* little endian support in the toolchains, it's far from complete21:33
daliasso for now we can just treat it as fixed big-endian21:33
daliasif little is needed later it can be added as the non-default subarch21:33
stekernyup21:33
daliasso it looks like there's no need for any subarchs right now21:33
daliasthat makes adding it nice and clean :)21:33
daliashow was the jmp_buf size issue handled?21:41
daliasstekern, also, in __unmapself...21:42
daliasyou don't load any args for the syscalls. i'm guessing the args are already in the right registers for munmap21:42
daliasbut for SYS_exit, the arg should be 021:43
daliasactually maybe it doesn't matter21:43
daliasi don't think this code is reachable if the exiting thread is the last thread21:43
daliasand the exit code to SYS_exit is only seen for the last exiting thread21:43
stekerndalias: hmm, yes. looking at the other archs, it is a bit of a mix whether 0 is loaded as the arg21:52
stekernregarding jmp_buf size, blueCmd_ and I discussed that and decided to change glibc to reflect what musl does21:59
blueCmd_(y)22:00
stekernthe glibc port is still in an experimental state22:00
blueCmd_I was very hard to convince22:00
blueCmd_at this point I think musl might be more stable22:00
dalias:)22:07
daliasstekern, why are the SYS_ macros not in the same order as the __NR_ ones?22:30
blueCmd_http://bluecmd.github.io/or1k.html - that turned out quite nice for a simple ODT -> HTML23:30
--- Log closed Thu Jul 17 00:00:23 2014