Discussion:
Building OpenAFS-1.6.0 on Oracle Solaris 11 1111 fails
Ulrich Pralle
2011-12-01 15:40:01 UTC
Hi,

compilation of OpenAFS-1.6.0 fails on Oracle Solaris 11 1111 (SunOS 5.11
version 11.0).

My test machine is a fresh installation using the Oracle Solaris 11 Automated
Installer (AI) plus Oracle Solaris Studio 12.2. The machine is a logical domain
(LDOM) on a Sun-Blade-T6320 (SPARC T2, sun4v); the primary LDOM is running
Oracle Solaris 11 1111. I installed the following additional packages:
pkg:/group/system/solaris-large-server, pkg:/group/system/solaris-desktop,
pkg:/text/locale, and pkg:/system/header.

# uname -a
SunOS troll 5.11 11.0 sun4v sparc SUNW,Sun-Blade-T6320

I configured and compiled OpenAFS-1.6.0 with the following commands:

#!/bin/sh
f=/opt/SUNWspro; [ -h $f ] && /bin/rm -f $f && ln -s /opt/solstudio12.2 $f && ls -l $f
export PATH=/usr/sbin:/sbin:/usr/bin:/usr/sfw/bin;
LDFLAGS='-L/usr/lib:/usr/vice/lib -R/usr/lib:/usr/vice/lib' ./configure --prefix=/usr/vice
make

The compilation stops at openafs-1.6.0/src/sys/glue.c:

/opt/SUNWspro/bin/cc -I/export/openafs-1.6.0/src/config -I/export/openafs-1.6.0/include -I. -I. -dy -Bdynamic -c ./glue.c
"./glue.c", line 125: warning: implicit function declaration: _IOW
"./glue.c", line 125: syntax error before or at: struct
cc: acomp failed for ./glue.c
*** Error code 2
make: Fatal error: Command failed for target `glue.o'
Current working directory /export/openafs-1.6.0/src/sys

_IOW is defined in /usr/include/sys/ioccom.h, but that header does not get
included from openafs-1.6.0/include/afs/param.h, because BSD_COMP isn't (yet)
defined. I tried the following quick and dirty fix:

diff -rc openafs-1.6.0-orig/src/sys/glue.c openafs-1.6.0/src/sys/glue.c
*** openafs-1.6.0-orig/src/sys/glue.c Di. Aug 16 14:26:14 2011
--- openafs-1.6.0/src/sys/glue.c Do. Dez 1 14:34:41 2011
***************
*** 102,107 ****
--- 102,108 ----
#endif

#ifdef AFS_SUN511_ENV
+ #include <sys/ioccom.h>
int
ioctl_sun_afs_syscall(long syscall, uintptr_t param1, uintptr_t param2,
uintptr_t param3, uintptr_t param4, uintptr_t param5,
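
To confirm which header supplies a macro such as _IOW before patching, a search along the following lines may help. This is only a sketch: the function name find_macro_header is made up for illustration, and it assumes a POSIX grep (on Solaris, /usr/xpg4/bin/grep rather than the older /usr/bin/grep).

```shell
#!/bin/sh
# find_macro_header MACRO DIR
# Print every *.h file directly under DIR that contains a "#define MACRO" line.
# Hypothetical helper for illustration; not part of OpenAFS.
find_macro_header() {
    macro=$1
    dir=$2
    # -E: extended regex, -l: print matching file names only.
    # The trailing group keeps _IOW from also matching _IOWR.
    grep -El "^[[:space:]]*#[[:space:]]*define[[:space:]]+${macro}([^A-Za-z0-9_]|\$)" \
        "$dir"/*.h 2>/dev/null
}

# Example (on the machine from this report):
#   find_macro_header _IOW /usr/include/sys
```

The search is deliberately limited to one directory; it does not follow nested #include chains or evaluate #ifdef conditions such as BSD_COMP.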

Compilation then succeeds with some annoying warnings (and kdump.c cannot be
compiled). Loading the kernel module libafs.nonfs.o fails with

unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_vcount'
unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_cacheStats'
unix: [ID 472681 kern.notice] WARNING: mod_load: cannot load module 'afs'
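
A linked module can be checked for unresolved afs_* symbols before trying to load it. A minimal sketch, assuming nm -P (the POSIX output format, one "name type ..." line per symbol), a POSIX awk, and that every afs_-prefixed symbol is supposed to be defined inside the module itself; the function names are invented for this example.

```shell
#!/bin/sh
# Read `nm -P` output on stdin and print afs_* symbols whose type is U
# (undefined). Undefined kernel symbols such as vn_rele are expected in a
# kernel module and are skipped.
filter_undefined_afs() {
    awk '$2 == "U" && $1 ~ /^afs_/ { print $1 }'
}

# check_module OBJECT: succeed only if no afs_* symbol is left undefined.
check_module() {
    missing=$(nm -P "$1" | filter_undefined_afs)
    if [ -n "$missing" ]; then
        echo "unresolved in $1:" >&2
        echo "$missing" >&2
        return 1
    fi
    return 0
}

# Example: check_module libafs.nonfs.o
```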

After fiddling about with src/libafs/Makefile.common.in, the linkage of
libafs.{nonfs,}.o succeeds with all afs-symbols resolved:

diff -rc openafs-1.6.0-orig/src/libafs/Makefile.common.in openafs-1.6.0/src/libafs/Makefile.common.in
*** openafs-1.6.0-orig/src/libafs/Makefile.common.in Di. Aug 16 14:26:14 2011
--- openafs-1.6.0/src/libafs/Makefile.common.in Do. Dez 1 15:31:49 2011
***************
*** 344,350 ****
afs_chunk.o: $(TOP_SRC_AFS)/afs_chunk.c
$(CRULE_NOOPT)
afs_daemons.o: $(TOP_SRC_AFS)/afs_daemons.c
! $(CRULE_NOOPT)
afs_dir.o: $(TOP_SRCDIR)/dir/dir.c
$(CRULE_NOOPT)
afs_icl.o: $(TOP_SRC_AFS)/afs_icl.c
--- 344,350 ----
afs_chunk.o: $(TOP_SRC_AFS)/afs_chunk.c
$(CRULE_NOOPT)
afs_daemons.o: $(TOP_SRC_AFS)/afs_daemons.c
! $(CRULE_OPT)
afs_dir.o: $(TOP_SRCDIR)/dir/dir.c
$(CRULE_NOOPT)
afs_icl.o: $(TOP_SRC_AFS)/afs_icl.c


But sadly, the kernel module panics Solaris instantly after afsd -stat 2000
-dcache 800 -daemons 3 -volumes 70 -afsdb has been started:

panic[cpu4]/thread=3000b523500: BAD TRAP: type=31 rp=2a101670ba0 addr=98 mmu_fsr=0 occurred in module "genunix" due to a NULL pointer dereference

afsd: trap type = 0x31
addr=0x98
pid=20418, pc=0x11b8928, sp=0x2a101670441, tstate=0x4411001606, context=0x916
g1-g7: 30026e94180, 1, 1, 3001b6526a0, 0, 0, 3000b523500

000002a1016708f0 unix:die+7c (31, 2a101670ba0, 98, 0, 0, 10d5800)
%l0-3: 0000000000000031 0000000001000000 0000000000002000 00000000010d58d8
%l4-7: 00000000010d5800 0000004411001606 0000000000000005 000002a1016709b0
000002a1016709d0 unix:trap+9a0 (2a101670ba0, 1c00, c0100000, 5, 0, 1)
%l0-3: 00000000c1680000 0000000000000031 0000030009ffcf98 00000000c1780000
%l4-7: 0000000000000001 0000000000000000 00000000fa176098 0000000000000000
000002a101670af0 unix:ktl0+64 (30026e95980, 0, 3001aee17d0, 30008355540, 3001aee1580, 7d03)
%l0-3: 0000030008c14000 0000000000000020 0000004411001606 000000000101ea40
%l4-7: 0000000070092f88 0000000000000001 0000000000000000 000002a101670ba0
000002a101670c40 genunix:vn_rele+58 (70092f88, 30026e94180, 0, 0, 1, 0)
%l0-3: 0000000000045f52 0000000000045f51 000003001caa1100 0000030026e94180
%l4-7: 0000000000004000 0000000000000000 000000007bfad7ec 0000000000000000
000002a101670d10 genunix:fop_open+158 (2a1016712c8, 3, 70092f88, 0, 0, 3)
%l0-3: 0000030026e94180 0000000000000000 0000000000000000 0000000000000000
%l4-7: 000003001b6526a0 0000030006c67940 0000000000002420 0000000000004000
000002a101670dc0 afs:osi_UfsOpen+148 (70092c48, 3000b523500, 3001aee1a10, 7d0e, 300024e1640, 0)
%l0-3: 0000000070092f88 0000000000000000 000003001aee1580 0000000001999000
%l4-7: 0000000000002000 0000000000000001 0000030008355540 0000000000000000
000002a1016712e0 afs:osi_UFSOpen+e4 (70092c48, 0, 0, 4000, 3000b523504, 3000b5234fc)
%l0-3: 0000000000000000 000000007008c000 000003001aee1580 0000030026e94180
%l4-7: 000000000000bb8b 000000000005c050 000000000005c04f 0000030007e68ee8
000002a1016713a0 afs:afs_InitCacheInfo+17c (1fff, 1fff, 3ff, 3ff, 7007c240, 70092)
%l0-3: 0000000000070092 0000000000070000 0000000070092000 0000000000070092
%l4-7: 0000000000070000 ffffffffffffe400 ffffffffffffe400 ffffffffffffffff
000002a101671500 afs:afs_syscall_call+108c (7, 58a48, 575dc, 58800, 0, 0)
%l0-3: 0000000000000007 000003001c919400 0000000000002000 0000000000000007
%l4-7: 00000300230c4f80 0000030007a11a50 0000000000000007 000000000199a7c8
000002a101671640 afs:Afs_syscall+84 (2a101671840, 2a101671840, 2a101671838, 1c, 70000001c, 0)
%l0-3: 0000000000000001 000000007008b000 0000030008283580 0000000000000000
%l4-7: 000000000196cd10 0000000000002000 0000000000400000 000000000199a9a4
000002a101671750 afs:devafs_ioctl+148 (15a00000000, ffffffff801c4302, ffbfc654, 100003, 0, 2a101671acc)
%l0-3: 0000000000000000 ffffffff801c4302 0000000000000001 000000007aaa3370
%l4-7: 000000007007cee8 000000007007cf70 00000300024fe000 0000000000000ad0
000002a1016718b0 genunix:fop_ioctl+c8 (3001caaec80, ffffffff801c4302, ffbfc654, 100003, 300144f85d0, 2a101671acc)
%l0-3: 0000000000000003 0000030009ffcf98 000003001481d7c0 0000000000000001
%l4-7: 0000000000000000 0000000000000000 ffffffff801c4302 0000000000000000
000002a101671970 genunix:ioctl+16c (3, ffffffff801c4302, ffbfc654, 3, ff2a9e74, ff29e1f4)
%l0-3: 000003000a4fd4c8 000000000000e89c 0000000000000003 0000030008c14000
%l4-7: 0000000000000003 0000000000000004 0000000000000000 0000000000000000

syncing file systems... done
dumping to /dev/zvol/dsk/rpool/dump, offset 65536, content: kernel
100% done: 78089 pages dumped, dump succeeded
rebooting...
Resetting...
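
For comparing panics across builds, the call chain can be pulled out of a trace like the one above. A rough sketch, assuming the frame lines keep the "address module:function+offset (args)" shape shown here and that a POSIX awk is used (nawk or /usr/xpg4/bin/awk on Solaris); stack_frames is a name made up for this example.

```shell
#!/bin/sh
# Read a Solaris panic trace on stdin and print "module:function" for each
# stack frame, innermost first. Register dump lines (%l0-3, %l4-7) and all
# other lines are ignored because their second field lacks the
# module:function+offset form.
stack_frames() {
    awk '$2 ~ /^[A-Za-z0-9_]+:[A-Za-z0-9_]+\+[0-9a-f]+$/ {
        sub(/\+[0-9a-f]+$/, "", $2)   # drop the +offset suffix
        print $2
    }'
}

# Example: stack_frames < panic.txt
```

Applied to the trace above, the first afs frame below genunix:fop_open is afs:osi_UfsOpen, which is reached from afs:afs_InitCacheInfo.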

My /etc/init.d/afs script is openafs-1.6.0/src/afsd/afs.rc.solaris.2.11.
I tried placing /usr/vice/cache on both ZFS and tmpfs; it made no difference.

I believe I'm missing some required Solaris 11 packages to successfully
compile and run OpenAFS-1.6.0.

Any clues?

If it helps, I can provide a buildslave.

Oh, by the way: no MOS (My Oracle Support) contract here.

Greetings

Uli Pralle

ps: in 2008 we managed to compile and run OpenAFS-1.5.78 on Oracle Solaris 11
Express. An upgrade from Express to Solaris 11 1111 with "pkg update
--accept" succeeds, but the 1.5.78-kernel module panics Oracle Solaris 11 1111
instantly after afsd has been started.

pps: I also tried 1.7.1 on Oracle Solaris 11 Express: it compiles and boots
fine, but stalls after some time, and ls(1) shows duplicate directory entries.
Andrew Deason
2011-12-01 17:55:30 UTC
On Thu, 1 Dec 2011 16:40:01 +0100 (CET)
Post by Ulrich Pralle
/opt/SUNWspro/bin/cc -I/export/openafs-1.6.0/src/config -I/export/openafs-1.6.0/include -I. -I. -dy -Bdynamic -c ./glue.c
"./glue.c", line 125: warning: implicit function declaration: _IOW
"./glue.c", line 125: syntax error before or at: struct
cc: acomp failed for ./glue.c
*** Error code 2
make: Fatal error: Command failed for target `glue.o'
Current working directory /export/openafs-1.6.0/src/sys
Should be fixed by
http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=5a2081ee04edc4944157b908200c3996a962edd1

But you've already worked around it.
Post by Ulrich Pralle
Compilation then succeeds with some annoying warnings (and kdump.c
cannot be compiled). Loading the kernel module libafs.nonfs.o fails
with
unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_vcount'
unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_cacheStats'
unix: [ID 472681 kern.notice] WARNING: mod_load: cannot load module 'afs'
This I can't debug easily unless I have a system to fiddle with.
afs_vcount is unconditionally defined in afs_vcache.o, and if you don't
have afs_vcache.o you're missing a ton of other stuff, too. (Same goes
for afs_cacheStats/afs_dcache.o)
Post by Ulrich Pralle
After fiddling about with src/libafs/Makefile.common.in, the linkage
This is _all_ you changed? The only thing I can see that that does is
add optimization, which shouldn't have much effect. It's possible for
symbol references to be optimized away, but I don't see how the
reference in afs_daemons.o can be removed; we need that value.
Post by Ulrich Pralle
I believe I'm missing some required Solaris 11 packages to
successfully compile and run OpenAFS-1.6.0.
No, you're probably just the first person to try running the client on
Solaris 11 sparc. I don't have any sparc boxes with 11 on them (though
that will probably change soon enough), so I can't really help you for
now unless you're offering me root access on such a machine of your own
that I can mess around with.
Post by Ulrich Pralle
Any clues?
If it helps, I can provide a buildslave.
That would be a great help. This page has some information on how to get
started with that: <http://wiki.openafs.org/AFSLore/BuildbotSlaveHowto/>

I believe the buildbot admin it mentions, whom you would want to contact,
is still Jason Edgecombe <***@rampaginggeek.com>. How fast would the
box be able to complete a full build of the OpenAFS tree?
--
Andrew Deason
***@sinenomine.net
Andrew Deason
2012-02-06 20:18:55 UTC
On Thu, 1 Dec 2011 16:40:01 +0100 (CET)
Post by Ulrich Pralle
But sadly, the kernel module panics Solaris instantly after afsd -stat
panic[cpu4]/thread=3000b523500: BAD TRAP: type=31 rp=2a101670ba0 addr=98 mmu_fsr=0 occurred in module "genunix" due to a NULL pointer dereference
afsd: trap type = 0x31
addr=0x98
pid=20418, pc=0x11b8928, sp=0x2a101670441, tstate=0x4411001606, context=0x916
g1-g7: 30026e94180, 1, 1, 3001b6526a0, 0, 0, 3000b523500
We never managed to get more info out of this, but if you're still
interested, this may be fixed by this patch:
<http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=ff81f39f26d85365256d0f821544a9af88c521d3>

That's not the version of the patch that will go in 1.6, but it should
work fine on Solaris 11. I can't be sure if that solves the issue or not
for you, but I saw a similar panic that turned out to be that issue.
--
Andrew Deason
***@sinenomine.net
Andrew Deason
2012-03-08 16:42:45 UTC
On Thu, 1 Dec 2011 16:40:01 +0100 (CET)
Post by Ulrich Pralle
Compilation then succeeds with some annoying warnings (and kdump.c
cannot be compiled). Loading the kernel module libafs.nonfs.o fails
with
unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_vcount'
unix: [ID 819705 kern.notice] /kernel/drv/sparcv9/afs: undefined symbol
unix: [ID 826211 kern.notice] 'afs_cacheStats'
unix: [ID 472681 kern.notice] WARNING: mod_load: cannot load module 'afs'
...and if anyone was wondering about this, this appears to be due to a
sparc-specific Solaris Studio bug that is present in at least 12.2 but
not 12.3. Gerrit 6888 has a workaround for the openafs side.
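
A build script could check which compiler release is in use before deciding whether the workaround matters. This is a sketch under stated assumptions: that `cc -V` prints a banner containing "Sun C <version>", and that Sun C 5.11 corresponds to Solaris Studio 12.2 and 5.12 to 12.3. Both should be verified against the local installation; the function names are hypothetical.

```shell
#!/bin/sh
# Read compiler banner text on stdin and print the "Sun C" version, e.g. 5.11.
sun_c_version() {
    sed -n 's/.*Sun C \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# needs_workaround VERSION: succeed if VERSION is the release assumed affected.
needs_workaround() {
    [ "$1" = "5.11" ]    # assumption: Sun C 5.11 ships with Solaris Studio 12.2
}

# Example (cc -V writes its banner to stderr):
#   v=$(/opt/SUNWspro/bin/cc -V 2>&1 | sun_c_version)
#   needs_workaround "$v" && echo "Studio 12.2 detected; see Gerrit 6888"
```

Since the bug is only known to be present in "at least 12.2", older releases may be affected as well; this check flags 12.2 alone.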
--
Andrew Deason
***@sinenomine.net
Derrick Brashear
2012-03-08 17:07:53 UTC
Somehow I missed the first message in the thread (but it's in my
inbox, so I dunno).

Thanks for fixing it.
--
Derrick