irc.oftc.net #zumastor log beginning Fri Feb 1 00:00:01 PST 2008
2008-02-01 00:10 -!- charlesn1(~charles@cpe-75-84-92-236.socal.res.rr.com) has joined #zumastor
2008-02-01 00:16 -!- charlesnw(~charles@cpe-75-84-92-236.socal.res.rr.com) has joined #zumastor
2008-02-01 01:02 -!- charlesnw(~charles@cpe-75-84-92-236.socal.res.rr.com) has joined #zumastor
2008-02-01 01:19 -!- charlesnw(~charles@cpe-75-84-92-236.socal.res.rr.com) has joined #zumastor
2008-02-01 02:00 -!- juuva(juuva@peili.org) has joined #zumastor
2008-02-01 02:43 flips: nice, but your upload is too slow for such a huge image
2008-02-01 02:44 apt-get install imagemagick ; convert -scale x600 fit.jpg
2008-02-01 02:44 I knew it
2008-02-01 02:44 already installed, just me being lazy
2008-02-01 02:44 how long did it take?
2008-02-01 02:44 too long
2008-02-01 02:45 40kB/s
2008-02-01 02:45 so just over a minute or so
2008-02-01 02:46 jpg could be a lot smaller just with compression
2008-02-01 02:46 it's probably straight off the camera
2008-02-01 02:46 and out of focus, boo
2008-02-01 02:46 ok, converted
2008-02-01 02:46 depth of field, not out of focus ;)
2008-02-01 02:47 there is one corner of the circuit board in the middle that is in focus
2008-02-01 02:47 nah it's not
2008-02-01 02:47 zoom all the way in
2008-02-01 02:47 blurry
2008-02-01 02:48 might be the lens, I had to put on the old kit lens otherwise the flash threw a shadow
2008-02-01 02:48 but hmm I guess it's not out of focus since the foreground and background are both more blurry
2008-02-01 02:48 canons really don't shoot very sharp
2008-02-01 02:48 :P
2008-02-01 02:48 -!- charlesnw(~charles@cpe-75-84-92-236.socal.res.rr.com) has joined #zumastor
2008-02-01 02:49 how'd you know I'd be up? :p
2008-02-01 02:49 flips: much better
2008-02-01 02:49 :)
2008-02-01 02:50 decorating my ddmap patch right now
2008-02-01 02:50 did you notice that when you try to create a device that already exists, device mapper returns EBUSY?
2008-02-01 02:50 no
2008-02-01 02:51 I also found that if you have zero length targets, dm does not check until it tries to install the table
2008-02-01 02:51 actually, tries to resume it
2008-02-01 02:51 it will install happily
2008-02-01 03:13 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-01 07:38 -!- charlesnw(~charles@dctm.siderean.com) has joined #zumastor
2008-02-01 08:58 -!- Tim_vimm(~Tim@206-171-55-69.ded.pacbell.net) has joined #zumastor
2008-02-01 09:38 Microhooey
2008-02-01 09:39 Microwho <- just trying to help
2008-02-01 11:49 so in my humble opinion, working on ddsnap.c is unpleasant
2008-02-01 11:49 and that is largely my fault
2008-02-01 11:49 for not being more aggressive about cleaning stuff up
2008-02-01 11:51 -!- fmayhar(~fmayhar@207.47.98.129.static.nextweb.net) has left #zumastor
2008-02-01 11:51 kernel-style error handling where negative error numbers are returned instead of returning -1 and having the error number in errno is a good idea for ddsnapd which may be ported to kernel one day, but is a bad idea for ddsnap.c where it conflicts with libc usage
2008-02-01 11:51 that's my fault
2008-02-01 11:52 I think it was perceived as kind of cool in ddsnapd (it is, but...) and got carried into ddsnap.c in the general fog of hacking
2008-02-01 12:00 so, ok, most ddsnap requests expect one and only one valid response, so there is going to be a function expect(sock, CODE, &size) that just reads the next message head, returns zero and fills in size if it is the expected message, otherwise handles the error generically
2008-02-01 12:01 this should be good for lots of lines of cruft removal
2008-02-01 12:02 /ping shapor, jiayingz
2008-02-01 12:05 the other huge annoyance in there is libpopt
2008-02-01 12:05 don't know what to do about that yet
2008-02-01 12:08 specifically re reading changelists, there is a struct cl_head stuct onto the beginning of the reply that appears to serve no purpose
2008-02-01 12:08 stuct -> stuck
2008-02-01 12:10 for now I will just omit it in the case of returning a delta against origin, the only useful field in there is chunk size which should be returned by a separate status enquiry
2008-02-01 12:11 ...or go with the flow, read the damn struct and put it on the cruft removal list for later
2008-02-01 12:11 thus becoming part of the problem
2008-02-01 12:18 transitional code something like this until more of the error returns are converted to errno style:
2008-02-01 12:18 err:
2008-02-01 12:18 errno = -err;
2008-02-01 12:18 fail:
2008-02-01 12:18 return fail();
2008-02-01 13:23 -!- jiayingz(~jiayingz@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-01 13:52 pong
2008-02-01 14:47 -!- nataliep(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-01 15:24 -!- pgquiles(~pgquiles@157.red-81-44-62.dynamicip.rima-tde.net) has joined #zumastor
2008-02-01 16:00 -!- charlesnw(~charles@dctm.siderean.com) has left #zumastor
2008-02-01 16:30 -!- nataliep(~nataliep@cpe-76-94-49-21.socal.res.rr.com) has joined #zumastor
2008-02-01 21:14 installing kde on the fit pc now
2008-02-01 21:14 frozen-bubble on the fit meets dana's approval
2008-02-01 21:14 got to try quake 2 pretty soon
2008-02-01 21:15 then zumastor of course
2008-02-01 21:16 it gets quite warm when I place it on top of the wrt11g
2008-02-01 21:16 better add some spacers to the stack
2008-02-01 21:24 I'm having a hell of a time trying to roll in the mv_sata driver
2008-02-01 21:25 mv?
2008-02-01 21:25 marvell
2008-02-01 21:25 right
2008-02-01 21:25 I sent mail with an update
2008-02-01 21:26 roll in means just configing it on, or applying a patch?
2008-02-01 21:26 or adding the module to initrd?
2008-02-01 21:26 Trying to build the module against 2.6.21.1
2008-02-01 21:26 ACTION looks for the mail
2008-02-01 21:27 we have a port to 2.6.23.13 now, jiaying has it
2008-02-01 21:27 well
2008-02-01 21:27 one big kernel patch
2008-02-01 21:27 actually, she just edited the kernel tree
2008-02-01 21:28 http://pastebin.com/ma016644
2008-02-01 21:28 I tried 2.6.24 as well
2008-02-01 21:29 http://pastebin.com/d954909f - 2.6.24
2008-02-01 21:30 (that's vanilla)
2008-02-01 21:30 just a sec
2008-02-01 21:30 INIT_WORK is scheduler skew
2008-02-01 21:31 you can send your fan mail to mingo ;)
2008-02-01 21:31 heh
2008-02-01 21:31 just checking to see what changed
2008-02-01 21:32 why 2.6.21.1 instead of 2.6.23.8?
2008-02-01 21:32 it's got a wonky licence
2008-02-01 21:32 how wonky?
2008-02-01 21:32 http://zumastor.googlecode.com/svn/trunk/ddsnap/INSTALL
2008-02-01 21:32 and
2008-02-01 21:33 compbrain.net/tmp/wonky.txt
2008-02-01 21:34 ooh, sun+binary modules
2008-02-01 21:34 = evil
2008-02-01 21:34 it's the same as vendor supplied I think
2008-02-01 21:35 probably we should apply some pressure to sun over this
2008-02-01 21:36 binary modules are in a grey area legally
2008-02-01 21:36 some copyright holders in the kernel do not agree they are legal
2008-02-01 21:36 anyway
2008-02-01 21:36 which kernel versions are known-good?
2008-02-01 21:37 I don't immediately see what is wonky about the license
2008-02-01 21:37 an old driver version:
2008-02-01 21:37 - RedHat AS v3 (Kernel version 2.4.21-9.EL)
2008-02-01 21:37 - Fedora Core2 (Kernel version 2.6.5-1.327)
2008-02-01 21:37 - Suse Linux 9.1 (Kernel version 2.6.4-52)
2008-02-01 21:38 2.6.5 is nigh on prehistoric
2008-02-01 21:39 http://lxr.linux.no/linux/include/linux/workqueue.h#L79 <- recent INIT_WORK
2008-02-01 21:40 http://lxr.linux.no/linux-bk+v2.6.5/include/linux/workqueue.h#L44 <- 2.6.5 INIT_WORK
2008-02-01 21:41 the version I'm playing with lists 2.6.15-1.2054 and 2.6.9-1.667
2008-02-01 21:42 we could backport if necessary
2008-02-01 21:43 but doing that just to help sun be evil would be annoying
2008-02-01 21:43 we don't actually rely on any recent kernel features, just have to adjust for minor skew
2008-02-01 21:43 So, if the softlockup bugs are still around with this driver, there may be no advantage to using it
2008-02-01 21:44 you saw softlockup with the open source driver I presume?
2008-02-01 21:44 Yea
2008-02-01 21:45 i've had to link autogen.h to config.h
2008-02-01 21:46 https://lists.linux-foundation.org/pipermail/bugme-new/2005-November/013334.html <- having a looksee for mv_sata bug activity
2008-02-01 21:47 that looks fun
2008-02-01 21:49 unless we can get that module built and an amd64 build, we can't test this box
2008-02-01 21:49 bugzilla search sucks ass
2008-02-01 21:49 we'll work it out one way or another
2008-02-01 21:49 s/ search//
2008-02-01 21:50 arguably
2008-02-01 21:52 going back to figuring out the INIT_WORK change
2008-02-01 21:52 http://lwn.net/Articles/211279/ Workqueues get a rework
2008-02-01 21:53 dhowells turns out to be the culprit. made the work queue struct smaller by changing the api
2008-02-01 21:54 ok, so you need to know whether the work queue you have is using the delayed work feature or not
2008-02-01 21:54 most do not
2008-02-01 21:55 if there is no delay, then just delete the data parameters
2008-02-01 21:57 make sure that mvLinuxIalLib.c has #include
2008-02-01 21:58 #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0) #include
2008-02-01 21:58 #endif
2008-02-01 21:58 then check workqueue.h and see if it defines INIT_WORK
2008-02-01 21:59 also, you can put some garbage in workqueue.h just to make sure the file is getting compiled
2008-02-01 21:59 the conditional looks right, but...
2008-02-01 22:02 i wonder what Linux_3_6_3 is
2008-02-01 22:02 3.6.3
2008-02-01 22:02 driver version
2008-02-01 22:03 wow, google totally fails to index mvSata_Linux_3_6_3
2008-02-01 22:03 That's unpacked from mvSatalinux-3.6.3_2-1.src.rpm
2008-02-01 22:04 shows how often fedora people get out on the web
2008-02-01 22:04 unpacked from X4500_Tools_And_Drivers_linux_28521a.tar.bz2
2008-02-01 22:04 downloaded from Sun's login-required download page
2008-02-01 22:04 got to ping sun about this
2008-02-01 22:05 send in a slashdot article ;)
2008-02-01 22:05 "sun outed as actually hating linux" :)
2008-02-01 22:06 I should ping niall and co and see what they had to do to get it integrated in the first place
2008-02-01 22:06 could you email me the tarball?
2008-02-01 22:06 which
2008-02-01 22:06 goog is fine
2008-02-01 22:06 or phunq.net
2008-02-01 22:06 http://compbrain.net/tmp/mvSatalinux-3.6.3_2-src.tar.gz ?
2008-02-01 22:07 I thought you were trying to get the sun one going?
2008-02-01 22:07 both then
2008-02-01 22:07 X4500_Tools_And_Drivers_linux_28521a.tar.bz2
2008-02-01 22:07 That is the Sun one, that's extracted from their src rpm
2008-02-01 22:07 ok, I'd like to have a boo at both
2008-02-01 22:08 phunq.net mail generally gets to me faster than goog
2008-02-01 22:08 there does seem to be a whole lot of lack of interest in this driver out there
2008-02-01 22:09 It's the only driver supporting hot swap and such on that machine
2008-02-01 22:09 the sun one or both?
2008-02-01 22:09 the sun one includes binary stuff or not?
2008-02-01 22:12 MV_BOOLEAN <- yuckola
2008-02-01 22:14 oh good, konsole has appeared on the fit's gnome menus
2008-02-01 22:17 yummy, konsole is up
2008-02-01 22:18 binary stuff?
2008-02-01 22:19 is there a binary blob in the sun driver or is it all source code?
2008-02-01 22:21 FAST guys say we are welcome to do a WiP session there
2008-02-01 22:21 even though past the deadline
2008-02-01 22:21 need to confirm that and figure out what needs doing
2008-02-01 22:29 willn, did you already send the drivers?
2008-02-01 22:29 don't see anything in my inbox
2008-02-01 22:33 I can't email that big tools and drivers blob
2008-02-01 22:33 grabbing http://compbrain.net/tmp/mvSatalinux-3.6.3_2-src.tar.gz
2008-02-01 22:33 just as you said
2008-02-01 22:33 it's 70M
2008-02-01 22:34 how can anybody make a 70M driver
2008-02-01 22:34 :P
2008-02-01 22:34 does it include several first person shooters?
2008-02-01 22:34 and a couple dozen pdfs?
2008-02-01 22:35 it's tools and drivers
2008-02-01 22:35 binaries and source
2008-02-01 22:35 same comment applies
2008-02-01 22:35 70M, unfsckingbelievable
2008-02-01 22:35 for several different linux distros
2008-02-01 22:35 and several different versions
2008-02-01 22:35 still
2008-02-01 22:36 the source is slightly less than 1M, for just the SATA driver
2008-02-01 22:36 the bundle included all drivers for the system
2008-02-01 22:36 wow
2008-02-01 22:36 when did people forget how to program
2008-02-01 22:37 is http://compbrain.net/tmp/mvSatalinux-3.6.3_2-src.tar.gz part of that package?
2008-02-01 22:39 I really do not see anything dodgy about license.txt
2008-02-01 22:39 am I missing something
2008-02-01 22:41 "6. Disclaimer
2008-02-01 22:41 =============
2008-02-01 22:41 No part of this document may be reproduced or transmitted in any form or by any means,
2008-02-01 22:41 electronic or mechanical, including photocopying and recording, for any purpose, without
2008-02-01 22:41 the express written permission of Marvell."
2008-02-01 22:41 weird
2008-02-01 22:43 I'm going to go crash, maybe I can poke at some of this tomorrow
2008-02-01 22:47 see you
2008-02-01 22:47 I'm having a look through that driver tree, is it the one that supports the good stuff?
2008-02-01 22:47 mvSatalinux-3.6.3_2-src.tar.gz
2008-02-01 22:56 well it's going to be a fair sized job, but the thing to do is just turn this source hairball into a proper kernel patch, compile it for the kernel version it actually builds on, then start forward porting
2008-02-01 22:57 then invite whoever to submit it for merging or send it up ourselves
2008-02-01 23:02 ok, the INIT_WORK interface changed in 2.6.20
2008-02-01 23:03 so 2.6.19.something would be a good place to start
2008-02-01 23:04 I happen to have a 2.6.19.1 tree lying around, so start there
2008-02-01 23:22 kde 3.5 is snappy on the fit... _much_ snappier than gnome
2008-02-01 23:23 so much for C being faster than C++
irc.oftc.net #zumastor log beginning Sat Feb 2 00:00:01 PST 2008
2008-02-02 00:23 installing firefox 3.0 on the fit
2008-02-02 00:23 this thing is sweet
2008-02-02 00:23 sound works out of the box
2008-02-02 00:23 ...which every web server needs
2008-02-02 00:26 it just volunteered to install itself as default browser replacing konq, let's see how much kde that breaks
2008-02-02 00:26 no complaints about the browsing speed
2008-02-02 02:15 kde4 installing now
2008-02-02 03:36 running kde4 now. Pretty progress bar.
2008-02-02 03:37 very pretty
2008-02-02 03:43 lotsa bugs
2008-02-02 07:20 4.0.1 will be released in a week, hopefully fixing quite a few bugs
2008-02-02 11:33 meanwhile I am back to 3.5 which is really nice
2008-02-02 11:33 the fit plays videos surprisingly well
2008-02-02 11:34 shockwave performance sucks, shows how open source is superior
2008-02-02 11:56 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-02 15:50 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-02 15:51 ping
2008-02-02 16:12 I haven't seen any more softlockups since the resync stopped, I'll have to load it up with more data and see what happens
2008-02-02 16:31 hi dank
2008-02-02 17:32 Do we support XFS or the like?
2008-02-02 17:40 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-02 17:48 willn, we do
2008-02-02 17:49 though we haven't tried it much
2008-02-02 17:49 Drake has, a little
2008-02-02 18:54 ok, groovy
irc.oftc.net #zumastor log beginning Sun Feb 3 00:00:01 PST 2008
2008-02-03 04:34 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-03 04:35 Moin
2008-02-03 05:27 Back to bed... http://zumastor.googlecode.com/svn/trunk/doc/zumastor-howto.html now has download instructions for 0.6... working to get ready for Monday's 0.6 release.
2008-02-03 11:30 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-03 12:39 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-03 12:42 lunchtime!
2008-02-03 12:51 I've uploaded 0.6 to zumastor.org and updated the howto to recommend downloading it.
2008-02-03 21:44 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-03 23:46 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
irc.oftc.net #zumastor log beginning Mon Feb 4 00:00:01 PST 2008
2008-02-04 02:00 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-04 02:07 -!- pgquiles__(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-04 03:23 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor
2008-02-04 03:31 -!- jelly(~jelly@lan.iskon.hr) has joined #zumastor
2008-02-04 04:14 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-04 05:56 -!- pgquiles__(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-04 06:01 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-04 07:54 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor
2008-02-04 09:41 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-04 11:53 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor
2008-02-04 12:51 can I run replication with a non-root user?
2008-02-04 13:08 -!- nataliep(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-04 13:27 no. I think the only command you can run as a non-root user is 'zumastor status'
2008-02-04 13:28 jiayingz: I think I did not ask the right question, let me rephrase it
2008-02-04 13:28 can the zumastor script be rewritten to run replication (and probably other commands) as a non-root user?
2008-02-04 13:28 (I'm a bit worried about phraseless ssh access for root)
2008-02-04 13:42 the ssh access is a problem. eventually we would like to use other ways to communicate between zumastor nodes, instead of using ssh
2008-02-04 13:43 but we may only want root users to be able to run zumastor commands
2008-02-04 13:43 the problem remains the same: keys or authentication are needed to exchange deltas
2008-02-04 13:44 I was thinking about that today and I think a quick fix for now would be:
2008-02-04 13:44 1) make /bin/zumastor check (i.e. whoami) if it's being run by root; if not, abort
2008-02-04 13:45 2) run replication through ssh but with a different user and not allowing login with that user
2008-02-04 13:46 of course point 2 requires some changes because zumastor needs a shell in some cases (ssh remotehost command)
2008-02-04 13:46 does your quick fix require both solutions, or just one of them?
2008-02-04 13:46 not "require" but "wish" :-)
2008-02-04 13:46 point 1 is trivial
2008-02-04 13:46 point 2 requires some work because of running commands on the target host
2008-02-04 13:47 the problem is 'zumastor replication' will change the zumastor database, so the question is whether we want a non-root user to do that
2008-02-04 13:47 oh, I see
2008-02-04 13:47 we tend to consider it an administrator's task
2008-02-04 13:48 well, we could have a "runas = theuser" in zumastor.conf, which could default to root
2008-02-04 13:48 I mean, that 'theuser' would be a user for administrative tasks only, with disabled login
2008-02-04 13:48 just like ftp, samba, etc do
2008-02-04 13:48 so if a non-root user wants to run 'zumastor snapshot' or 'zumastor replication', we may want to allow users to define where zumastor configuration files should be located
2008-02-04 13:49 no
2008-02-04 13:49 you just don't allow that
2008-02-04 13:49 'sudo zumastor snapshot' is the way to go
2008-02-04 13:49 that other user is only to avoid having to have a phraseless root key
2008-02-04 13:50 and that user would not be allowed to login to any machine, only to pipe deltas through ssh
2008-02-04 13:50 the lamest transport protocol in the world, by pgquiles(tm) :-)
2008-02-04 13:50 i c your point
2008-02-04 13:51 that is possible
2008-02-04 13:52 just like the administration user created for mysql, etc.
2008-02-04 13:52 exactly
2008-02-04 13:53 that should be easy to do
2008-02-04 13:53 could you create an issue for the request?
2008-02-04 13:53 or I can do it, if you would like
2008-02-04 13:54 I can do it
2008-02-04 13:54 to add insult to injury, I can devise a lame RMI "protocol" to quickly fix those "ssh remotehost command"
2008-02-04 13:55 you just write a "zumastor_pending_commands.sh" to the remote host, and the remote host would monitor this file and execute it when it's available
2008-02-04 13:57 if you want a non-root user to run 'zumastor replicate', you may need to change the ownership of /var/run/zumastor and /var/lib/zumastor
2008-02-04 13:57 k
2008-02-04 13:57 ok
2008-02-04 13:57 because replication needs to read/write to files in those directories
2008-02-04 13:58 there are already issues for what we've talked about: 40 and 51
2008-02-04 13:58 fix 51 (whoami != root => go to hell) is trivial
2008-02-04 13:59 ah, i saw the issue
2008-02-04 13:59 should I write my proposals for 40 and 51 in the issues' comments, or should I send an e-mail to the m-l for discussion?
2008-02-04 14:02 either way you would like. I prefer issue comments, but I know DanP doesn't like the issue tracker :)
2008-02-04 14:03 :-)
2008-02-04 14:04 I'll write to the issue tracker, then ping the mailing list
2008-02-04 14:04 that will be perfect :)
2008-02-04 14:05 I have thought about this issue before, and if you ask me, whatever you go with, it is hard to be more secure than a phraseless key and SSH
2008-02-04 14:06 The only problem is tying it to root. It'd be better to have the login be to a "zumastor" account.
2008-02-04 14:07 yes. that should be easy to do. I think all we need to do is to change the ownership of /var/run/zumastor and /var/lib/zumastor
2008-02-04 14:07 jiayingz: And if it proves hard to do, a little bit of setuid is probably preferable to the alternative.
2008-02-04 14:07 what about logins? how to get rid of that? or at least limiting that to the minimum, for example, logging to a chroot with no real access to the system
2008-02-04 14:09 that is possible
2008-02-04 14:09 pgquiles: Yeah, some kind of restricted shell makes sense. I'm not a huge fan of chroots for security, but something like making the zumastor script the login shell might be a good idea.
2008-02-04 14:10 cbsmith: I'm no fan of chroot either. I'd like it better if Zumastor could work with no need to log into the remote hosts.
2008-02-04 14:11 dropping a shellscript to be run is awful but at least you make sure only hosts in the "zumastor mesh" can send files to each other
2008-02-04 14:11 no third party, I mean
2008-02-04 14:12 then we need to have our own kerberos authentication
2008-02-04 14:12 pgquiles: Yeah, but it is kind of a Greenspun's 10th Law kind of thing: any alternative solution you come up with will probably end up being a buggy, less secure version of ssh. ;-) The only redeeming factor is that you end up with a simpler protocol.
2008-02-04 14:12 :-D
2008-02-04 14:14 cbsmith: and if you rephrase my "dropping a shameful shell-script.sh into the remote host" as "injecting an instruction token into a peer in the replication mesh" it sounds like something serious and well thought out!
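One way to approximate the "login that can only run zumastor" idea from the exchange above, without inventing a new protocol, is OpenSSH's forced command: a `command="..."` option on a key in authorized_keys makes sshd run a fixed program and pass the client's requested command in the SSH_ORIGINAL_COMMAND environment variable. The filter below is only a sketch under that assumption. The function names and the accepted pattern are made up for illustration and are not part of zumastor.

```shell
#!/bin/sh
# Sketch of a forced-command filter for a replication-only ssh key.
# sshd sets SSH_ORIGINAL_COMMAND to the command the client asked for when
# the key carries a command="..." option. Function names and the accepted
# pattern below are assumptions, not taken from zumastor.

# Succeeds only for a single zumastor invocation that contains none of the
# common shell metacharacters used for chaining, redirection or substitution.
allowed_command() {
    case "$1" in
        *';'*|*'&'*|*'|'*|*'<'*|*'>'*|*'`'*|*'$'*|*'('*|*')'*) return 1 ;;
        "zumastor "*) return 0 ;;
        *) return 1 ;;
    esac
}

# What the forced-command script would do with one request.
handle_request() {
    if allowed_command "$1"; then
        echo "would run: $1"    # a real wrapper would run the command here
    else
        echo "rejected" >&2
        return 1
    fi
}
```

In a real deployment the wrapper would be called as `handle_request "$SSH_ORIGINAL_COMMAND"`, and the key would normally also carry options such as `no-pty` and `no-port-forwarding`. This narrows what a phraseless key can do but does not remove the need for sshd on the target.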
2008-02-04 14:14 jiayingz: We could kerberize the service, but requiring kerberos is worse than requiring ssh
2008-02-04 14:15 requiring kerberos would probably need more than a few hours, although that would probably be the best solution
2008-02-04 14:16 I think using a different user and, if possible, disallowing login for that user, would be more than enough for now
2008-02-04 14:16 there are many storage-related issues still open and more important than kerberos, IMHO
2008-02-04 14:17 pgquiles: exactly
2008-02-04 14:25 -!- daniel__(~phlipz@phunq.net) has joined #zumastor
2008-02-04 14:28 cbsmith: jiayingz: I've sent the relevant part of the log to the m-l
2008-02-04 14:28 pgquiles: i noticed
2008-02-04 14:30 yes. I saw that
2008-02-04 14:30 thanks for reporting that
2008-02-04 14:31 btw, it'd be nice to have a verbose mode in /bin/zumastor to know what step has failed
2008-02-04 14:32 bash -x
2008-02-04 14:32 I did that :-)
2008-02-04 14:32 :D
2008-02-04 14:32 but it's not nice either
2008-02-04 14:32 I mean some kind of "trying to ssh connect to remote host...", etc
2008-02-04 14:32 so something like showing the errors from log files
2008-02-04 14:33 not that verbose, either :-)
2008-02-04 14:33 a short sentence which guides you
2008-02-04 14:33 we actually log most error messages, but the problem is only developers know where to get them
2008-02-04 14:34 for example, first time I tried to set up replication, I received a strange error which I could recognize as ssh-related but wondered why
2008-02-04 14:34 not exactly intuitive :-)
2008-02-04 14:35 i c. we do need to spend more time on that. they are not hard, just need time
2008-02-04 14:35 I can provide patches, if you want
2008-02-04 14:36 sure we want :)
2008-02-04 14:37 any patch from the public is greatly appreciated
2008-02-04 14:39 fine, I'll begin with the 1-minute root-checking in /bin/zumastor, meaningful errors and verbose mode
2008-02-04 14:39 it's 23.40 here, so that's for tomorrow :-)
2008-02-04 14:40 btw, I created a 3TB volume with a 3.5TB snapshot volume
2008-02-04 14:40 in two servers
2008-02-04 14:41 I was trying to set up replication when I found out about root ssh, etc
2008-02-04 14:41 last week we had some serious power issues here and I couldn't do any real test
2008-02-04 14:43 so the root ssh is the blocker for your setup?
2008-02-04 14:44 no, not a blocker
2008-02-04 14:44 but my mind would rest more comfortably :-)
2008-02-04 14:45 ok ;). I was wondering how serious the root ssh problem is
2008-02-04 14:45 I guess it depends on where you are going to use zumastor
2008-02-04 14:46 in my case, the company I work for is small and we have no kevin mitnick here :-)
2008-02-04 14:46 right, it is of course a serious problem
2008-02-04 14:48 another approach I could take, and it'll be really easy for me but it's not a real solution, is to use the second ethernet in the servers, set up a different VLAN in the switches, then perform replication through that port
2008-02-04 14:50 do you have kerberos authentication set up at your site?
2008-02-04 14:50 no
2008-02-04 14:51 so if we switch to kerberos, that may cause some problems deploying zumastor
2008-02-04 14:51 we have Active Directory but I don't think you can trust that implementation :-)
2008-02-04 14:52 kerberos wouldn't be a problem for me but what about NAS? how resource intensive is a kerberos daemon?
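The "1-minute root-checking" mentioned above (issue 51: abort unless run by root) might look roughly like the sketch below. The helper names `is_root` and `require_root` are hypothetical and not taken from /bin/zumastor. The uid is accepted as an optional argument only so the check can be exercised without actually being root.

```shell
#!/bin/sh
# Hypothetical helpers; not from the actual /bin/zumastor script.

# Succeeds (returns 0) when the given uid is 0.
# Defaults to the caller's effective uid as reported by `id -u`.
is_root() {
    _uid="${1:-$(id -u)}"
    [ "$_uid" -eq 0 ]
}

# Prints a short message on stderr and returns non-zero unless the uid is 0.
# A real script would probably `exit 1` here instead of returning.
require_root() {
    if ! is_root "$@"; then
        echo "zumastor: this command must be run as root" >&2
        return 1
    fi
}
```

Called with no argument near the top of the script, `require_root || exit 1` would stop every subcommand for ordinary users. Since 'zumastor status' is said to work for non-root users, a real check would presumably be applied per subcommand rather than globally.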
2008-02-04 14:52 the last NAS I bought (and I really mean the *last one*) was a Linksys with an ARM processor and 32MB of RAM
2008-02-04 14:52 and that was the most powerful model
2008-02-04 14:52 'resource intensive' means cpu and memory, or administration?
2008-02-04 14:53 cpu and memory
2008-02-04 14:54 otoh, the current ssh also has some impact, I don't know what size the key is
2008-02-04 14:55 shouldn't be much. but I am not quite sure
2008-02-04 14:55 sorry I have to go
2008-02-04 14:55 talk to u later
2008-02-04 14:55 a few weeks ago I tried to generate a small key to fix the apache+webdav+windows nightmare and ubuntu wouldn't allow < 512 bits
2008-02-04 14:55 ok, see you
2008-02-04 16:38 -!- nataliep(~nataliep@cpe-76-94-49-21.socal.res.rr.com) has joined #zumastor
2008-02-04 17:07 -!- charlesnw(~charles@ses.siderean.com) has left #zumastor
2008-02-04 17:37 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor
irc.oftc.net #zumastor log beginning Tue Feb 5 00:00:01 PST 2008
2008-02-05 00:57 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-05 07:24 -!- daniel__(~phlipz@12.34.85.2) has joined #zumastor
2008-02-05 07:47 -!- daniel__(~phlipz@12.34.85.2) has joined #zumastor
2008-02-05 07:58 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-05 08:22 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor
2008-02-05 08:26 Howdy...
2008-02-05 08:26 Any comments on the simplification I did on http://zumastor.org ?
2008-02-05 08:30 dank: I like it
2008-02-05 08:30 -!- daniel__(~phlipz@12.34.85.2) has joined #zumastor
2008-02-05 08:32 tnx.
2008-02-05 08:32 I agree it sure would be nice to get rid of ssh, but IMHO getting multilayer replication working etc. is higher priority... the ssh stuff is just plumbing.
2008-02-05 08:33 (I have ALWAYS disliked our use of ssh, but I figured it was just a prototype.)
2008-02-05 08:33 (I have a design for getting rid of it, just no time/manpower to implement yet.)
2008-02-05 09:06 -!- daniel__(~phlipz@12.34.85.2) has joined #zumastor
2008-02-05 10:02 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor
2008-02-05 13:07 mmm so I started to implement this --verbose mode in zumastor but it seems it will break everything :-(
2008-02-05 13:30 is there any way to tell bash that '=' should take the last echo'ed value in a function (like Ruby) instead of the first one? :-?
2008-02-05 13:31 head about stderr?
2008-02-05 13:31 heard*
2008-02-05 13:33 mmm good idea, I had not thought of it :-)
2008-02-05 13:37 shell debug output goes to stderr, especially when writing debian install scripts; they do horrible things with fds. :-)
2008-02-05 14:23 pgquiles: Wait.. what do you mean by --verbose mode?
2008-02-05 14:26 cbsmith: it shows the steps zumastor takes for each command, for example 'zumastor define volume ...' involves calling several functions. Sometimes it's useful to know when zumastor fails and why rather than seeing a cryptic error, which usually is just the output of some command zumastor uses but the user does not necessarily know is being used (for example, ssh)
2008-02-05 14:26 pgquiles: Ah, that one. Cool. Thanks for working on it.
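The stderr suggestion above works because command substitution, `$(...)`, captures only stdout. A minimal sketch, assuming invented names (`log_step`, `pick_port` and the `VERBOSE` variable are illustrative and do not come from the zumastor script): progress messages go to stderr, so a function can narrate its steps and still hand back exactly one value.

```shell
#!/bin/sh
# Illustrative names only; none of these come from /bin/zumastor.

# Print a progress message on stderr when VERBOSE is set to 1.
log_step() {
    if [ "${VERBOSE:-0}" = "1" ]; then
        echo "zumastor: $*" >&2
    fi
}

# Hands back a value on stdout while narrating on stderr.
# The arithmetic is a stand-in for whatever a real function would compute.
pick_port() {
    log_step "choosing a port for volume $1"
    echo $((11000 + $2))
    log_step "done"
}

# Command substitution sees only stdout, so the caller gets just the number
# even though messages were printed before and after it.
VERBOSE=1
port=$(pick_port testvol 235)
echo "port=$port"    # stdout: port=11235 (the log lines went to stderr)
```

With this split, turning verbose mode on or off cannot change what callers capture, which is the breakage described above.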
2008-02-05 14:34 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor
2008-02-05 16:17 -!- Marcin(~chatzilla@c-76-23-112-26.hsd1.sc.comcast.net) has joined #zumastor
2008-02-05 18:10 -!- charlesnw(~charles@ses.siderean.com) has left #zumastor
2008-02-05 20:10 -!- flips(~phillips@phunq.net) has joined #zumastor
2008-02-05 20:10 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-05 22:49 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
irc.oftc.net #zumastor log beginning Wed Feb 6 00:00:01 PST 2008
2008-02-06 03:02 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-06 05:46 -!- jelly(~jelly@lan.iskon.hr) has joined #zumastor
2008-02-06 07:38 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-06 10:18 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-06 10:18 I wonder how the 0.6 release party is going...
2008-02-06 11:57 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor
2008-02-06 12:00 dkegel: ping
2008-02-06 13:33 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-06 13:37 pgquiles: pong
2008-02-06 13:38 dkegel: what's your proposal to get rid of ssh for replication?
2008-02-06 13:47 Something like this: a connection manager that is a separate userspace daemon that listens on a Unix domain socket for connection requests, and returns file descriptors over the socket. It would hide all the details of secure connection establishment.
2008-02-06 13:47 In other words, factor the problem out of zumastor.
2008-02-06 13:50 This is completely separate from Dan's proposal to run as unprivileged user 'zumastor', which sounds like a fine idea, too.
2008-02-06 13:52 dkegel: ssh + unprivileged user was my proposal, actually, but there's still the problem of logins
2008-02-06 13:52 I'd like it better if there was no need for logins
2008-02-06 13:52 Yes, that's what my part is for.
2008-02-06 13:52 or at least, if it could work with a restricted login (rbash)
2008-02-06 13:53 Shapor was going to write it with plain sockets originally, so we all kind of want it to happen.
2008-02-06 13:53 pgquiles: Well... you don't want just anyone messing with the snapshot store though, so you do need some level of authentication.
2008-02-06 13:54 Yeah. My proposal would let the admin select the level of authentication (kind of like a very simpleminded pam).
2008-02-06 13:54 cbsmith: of course, authentication is a need
2008-02-06 13:55 except for admins who just want to set it up without authentication for debugging...
2008-02-06 13:57 my proposal of a non-login, lame protocol which copies files ("cookies") via scp and authenticates with ssh keys would work fine, I think
2008-02-06 13:57 for example:
2008-02-06 13:57 run_remote $host "zumastor receive start $remote_vol $port" > $remote_file
2008-02-06 13:57 I really don't want to do a remote execution protocol.
2008-02-06 13:58 I'd rather have a better defined protocol that actually is specific to zumastor.
2008-02-06 13:58 so you like a pure data transport protocol better?
2008-02-06 14:00 Yes. The connection manager would handle any authentication protocol, then provide plain old byte streams; on top of that, zumastor would have a real protocol (which might look a lot like the commands we currently use).
2008-02-06 14:02 The difference is small. It's mainly that we would have our own daemon listening for connections rather than sshd, and there would be fewer forks involved.
2008-02-06 14:02 And no /bin/login.
2008-02-06 14:03 fine with me, after all, you guys are the ones coding :-P
2008-02-06 14:03 Or not coding, as the case may be. The connection manager idea has been sitting there for a long time.
2008-02-06 14:03 I recall having read this morning (-14 hours :-) some sort of "pipe" like the one you describe
2008-02-06 14:05 The world is full of similar pipes...
2008-02-06 14:13 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-06 14:36 -!- kfortin(~chatzilla@smtp.violintech.net) has joined #zumastor
2008-02-06 15:09 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor
2008-02-06 15:38 hi all
2008-02-06 15:39 hi
2008-02-06 15:39 howdy
2008-02-06 15:55 flipz: ping
2008-02-06 16:13 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor
2008-02-06 20:50 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor
2008-02-06 22:00 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor
2008-02-06 22:59 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor
2008-02-06 23:28 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor
2008-02-06 23:31 my power meter arrived
2008-02-06 23:31 so... the fit PC consumes 6 watts ac
2008-02-06 23:34 flips: fit pc?
2008-02-06 23:35 yes
2008-02-06 23:35 little pc
2008-02-06 23:35 runs on 5 watts, and can run kde complete with ooffice and movie playing
2008-02-06 23:35 nice
2008-02-06 23:36 some trolltech guy was blogging yesterday about getting kde4 to run on a gumstix :-)
2008-02-06 23:36 it's to be my web server, I think I will use zumastor to replicate its web files just for fun
2008-02-06 23:37 gumstix would be much cooler indeed
2008-02-06 23:37 flips: btw, according to http://zumastor.googlecode.com/svn/trunk/doc/install.html#user I should not create a filesystem after defining the volume in the target. Is that right?
2008-02-06 23:37 s/should not/do not need 2008-02-06 23:37 root@lab2:~# zumastor define volume testvol /dev/sysvg/test /dev/sysvg/test_snap --initialize 2008-02-06 23:37 root@lab2:~# zumastor define source testvol lab1.example.com --period 600 2008-02-06 23:38 true, you do not need to create a new filesystem 2008-02-06 23:38 you can snapshot an existing volume 2008-02-06 23:39 snapshot? I was talking about replication 2008-02-06 23:40 also true, sorry 2008-02-06 23:40 :-) 2008-02-06 23:40 but I do need to create a filesystem myself, don't I? or is zumastor intelligent enough to do it? 2008-02-06 23:41 zumastor will just replication the upstream volume to the target 2008-02-06 23:41 no filesystem needs to exist on the target 2008-02-06 23:41 s/replication/replicate/ 2008-02-06 23:41 ACTION needs some zzz's 2008-02-06 23:41 mmm well, it's not working for me on my 3TB data + 3.5TB snapshots volume 2008-02-06 23:42 I'm going to destroy and re-setup everything to find out what's happening 2008-02-06 23:42 sweet dreams 2008-02-06 23:42 the initial replication cycle can take a long time 2008-02-06 23:42 if all 3TB need to be copied 2008-02-06 23:42 no, only a test file of about 30 bytes :-) 2008-02-06 23:43 the rest of the volume has to be scanned, should be all zeros, will compress small 2008-02-06 23:43 but scanning takes a while 2008-02-06 23:43 I'm pretty sure I did something wrong because I set up replication on Friday and it was working 2008-02-06 23:43 ok 2008-02-06 23:43 not more than 14 hours :-) 2008-02-06 23:43 well, sleep first, debug tomorrow I think 2008-02-06 23:44 it's morning here :-D 2008-02-06 23:44 this may end up as a bug entry re reporting errors better 2008-02-06 23:44 world clock fun 2008-02-06 23:44 and replication progress, if that is the issue 2008-02-06 23:44 yes 2008-02-06 23:44 well I have some sleep deficit to fix 2008-02-06 23:45 bye 2008-02-06 23:45 entire google just about spent the last 2 days at disneyland 2008-02-06 23:45 ACTION
yawns 2008-02-06 23:45 see you irc.oftc.net #zumastor log beginning Thu Feb 7 00:00:01 PST 2008 2008-02-07 00:35 pgquiles: any luck? 2008-02-07 00:36 shapor: with replication? I've been interrupted with questions about SNMP :-/ 2008-02-07 00:36 I'm on it now 2008-02-07 02:26 replication is not working :-/ 2008-02-07 02:28 hm, whats the target log say? 2008-02-07 02:34 Thu Feb 7 10:10:14 CET 2008 /bin/zumastor[6090]: start daemon volume 'zumatest', target '10.0.2.21' 2008-02-07 02:34 Thu Feb 7 10:10:14 CET 2008 /bin/zumastor[6090]: waiting for new snapshot... 2008-02-07 02:34 shapor: does that mean it won't replicate until I run 'zumastor snapshot'? 2008-02-07 02:35 anyway, something's wrong 2008-02-07 02:36 I did 'zumastor forget target' and 'zumastor forget volume' on origin, and 'zumastor forget source' on the replica, then 'zumastor define volume' and 'zumastor define master' again on the origin and now snapshots are not working on origin 2008-02-07 02:37 or am I wrong and that is not supposed to work? 2008-02-07 02:45 mmm zumastor status --usage shows the snapshots but they are not mounted 2008-02-07 02:46 no, new snapshots are not being produced 2008-02-07 02:46 either I'm wrong or this is a bug in zumastor 2008-02-07 02:47 I'm trying to reproduce it starting from scratch 2008-02-07 02:50 ah i see you are specifying an ip address as the target from the log 2008-02-07 02:50 that, unfortunately, won't work 2008-02-07 02:50 needs to be a fully qualified host name 2008-02-07 02:51 oh 2008-02-07 02:51 good to know 2008-02-07 02:51 i believe that is noted in the doc 2008-02-07 02:52 but we know it is a lame limitation 2008-02-07 02:52 I can't find it in the howto or the man page 2008-02-07 02:54 shapor: what about the non-mounting snapshots? is that the right behavior? 
2008-02-07 02:55 ah according to the howto my original "really aweful writeup" is the only documentation of replication 2008-02-07 02:55 http://zumastor.googlecode.com/svn/trunk/doc/zumastor-howto.html#_remote_replication_2 2008-02-07 02:56 the howto needs to cover replication 2008-02-07 02:57 actually, that isn't the right doc either 2008-02-07 02:57 yes, I was adapting your install guide to the howto terms (essentially, changing 'testvol' to 'zumatest') 2008-02-07 02:57 did you find the howto useful in general? 2008-02-07 02:57 the doc should also say you do not need to mkfs the volume in the target 2008-02-07 02:58 yes, the howto is very useful 2008-02-07 02:58 good :) we need to improve it quite a lot 2008-02-07 02:59 non-mounting snapshots? 2008-02-07 02:59 what do you mean 2008-02-07 02:59 [11:36] I did 'zumastor forget target' and 'zumastor forget volume' on origin, and 'zumastor forget source' on the replica, then 'zumastor define volume' and 'zumastor define master' again on the origin and now snapshots are not working on origin 2008-02-07 02:59 [11:37] or am I wrong and that is not supposed to work? 2008-02-07 02:59 oh i see (reading scrollback) 2008-02-07 02:59 :-) 2008-02-07 02:59 I wiped everything and am trying to reproduce it 2008-02-07 03:00 did you wipe/recreate using define volume --initialize ? 2008-02-07 03:00 no, without --initialize 2008-02-07 03:00 I wanted to keep data and snapshots 2008-02-07 03:01 ok without --initialize, you will see the old snapshots with --usage 2008-02-07 03:01 but zumastor won't know about them 2008-02-07 03:01 that's it 2008-02-07 03:01 you could "teach" it about them 2008-02-07 03:01 how? 2008-02-07 03:02 but it would require manually editing the /var/lib/zumastor/volumes/...
2008-02-07 03:02 ouch 2008-02-07 03:02 zumastor's metadata of which snapshots are for which purposes 2008-02-07 03:02 is stored in its filesystem data in /var/lib 2008-02-07 03:02 not on the volume itself 2008-02-07 03:03 so by zumastor forgetting volume, I wiped that, didn't I? 2008-02-07 03:03 yep 2008-02-07 03:04 also interesting to note in the howto :-) 2008-02-07 03:04 yes 2008-02-07 03:04 indeed 2008-02-07 03:04 sorry :( 2008-02-07 03:04 np 2008-02-07 03:04 I'm still testing this 2008-02-07 03:04 but I'd like to move it into production very soon 2008-02-07 03:08 shapor: why don't IPs work for replication? 2008-02-07 03:10 because zumastor is written in bash :) 2008-02-07 03:11 what's the problem with IPs in bash? :-? 2008-02-07 03:11 it's a bit of a long story 2008-02-07 03:11 we have these zumastor target daemons for each target 2008-02-07 03:12 they all listen on a sock whose path is based on the target name 2008-02-07 03:12 in order to trigger replication you have yo find that socket path 2008-02-07 03:12 to* 2008-02-07 03:13 also, when downstream "requests" a snapshot from upstream 2008-02-07 03:13 it needs to find the target trigger as well 2008-02-07 03:14 and the only unique identifier it has is its hostname 2008-02-07 03:14 it is the only identifier a machine can have exactly one of 2008-02-07 03:14 a machine can have many ips 2008-02-07 03:14 and many names, too 2008-02-07 03:15 so lacking a cluster infrastructure with unique identifiers 2008-02-07 03:15 we cheated and used `uname -n` 2008-02-07 03:16 for now...
2008-02-07 03:16 we have several ideas on how to fix it 2008-02-07 03:16 but that is also the problem, we have several ideas, not just one ;) 2008-02-07 03:16 :-D 2008-02-07 03:23 pgquiles: thanks for the feedback, i will work on the documentation tomorrow 2008-02-07 03:23 i need to catch some zzz's 2008-02-07 03:23 yup, it must be really late there 2008-02-07 03:23 yeah after 3am 2008-02-07 03:24 thanks to you all for the help and developing zumastor 2008-02-07 04:59 -!- camgirl29(~camgirl29@d033.dhcp212-198-248.noos.fr) has joined #zumastor 2008-02-07 06:21 either I'm doing something very wrong or zumastor is using a wrong patch 2008-02-07 06:21 path 2008-02-07 06:21 Thu Feb 7 15:19:19 CET 2008 /bin/zumastor[5462]: new 'target/dubna' snapshot requested for volume 'zumatest' 2008-02-07 06:21 /bin/zumastor: '/var/lib/zumastor/volumes/zumatest/master/schedule/target/dubna' does not exist, doing nothing 2008-02-07 06:21 is it looking for /var/lib/zumastor/volumes/zumatest/targets/dubna/ maybe? 2008-02-07 06:23 the only file in /var/lib/zumastor/volumes/zumatest/master/schedule/ is "hourly" (because I defined -h 24) 2008-02-07 06:49 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 07:52 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-07 08:09 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-07 08:17 -!- lidi20(~lidi20@d033.dhcp212-198-248.noos.fr) has joined #zumastor 2008-02-07 09:00 good morning 2008-02-07 09:02 pgquiles, did you do "define target" ? 2008-02-07 09:04 flips: yes 2008-02-07 09:04 what was the exact command?
2008-02-07 09:05 zumastor define target zumatest dubna:11235 -p 3600 2008-02-07 09:05 the target server is accessible as 'dubna' or 'dubna.arisnova.local' 2008-02-07 09:06 try: tree -x /var/lib/zumastor 2008-02-07 09:07 I am guessing this is because of the :11235 2008-02-07 09:07 flips: replication specifying snapshot has not finished or succeeded, I'm not sure if it was aborted after another scheduled replication (without specifying snapshot) was automatically fired, if it has not finished transmitting data yet, or if it failed :-/ 2008-02-07 09:08 not being able to find dubna/ would prevent replication 2008-02-07 09:08 flips: http://rafb.net/p/L15zOB63.html 2008-02-07 09:08 did you try the tree command above? 2008-02-07 09:08 what's the problem with 11235? 2008-02-07 09:09 I am not sure we support port syntax in that way 2008-02-07 09:09 flips: and this is in dubna: http://rafb.net/p/Nj6Hq384.html 2008-02-07 09:09 try: tree -xf /var/lib/zumastor 2008-02-07 09:10 flips: that's the syntax the large volume copy test uses :-? 2008-02-07 09:10 tree -xf /var/lib/zumastor | grep dubna 2008-02-07 09:10 pgquiles, ok 2008-02-07 09:10 let's see what the command actually put in the database 2008-02-07 09:10 flips: http://rafb.net/p/mjpbsG21.html 2008-02-07 09:11 did you try: tree -xf /var/lib/zumastor | grep dubna 2008-02-07 09:11 ?
2008-02-07 09:12 flips: yes, the output is in http://rafb.net/p/mjpbsG21.html 2008-02-07 09:12 oh :) 2008-02-07 09:12 with the grep it is small enough just to paste into the channel 2008-02-07 09:12 O:-) 2008-02-07 09:14 /var/lib/zumastor/volumes/zumatest/master/schedule/target/dubna <- should be /targets/ not /target/ 2008-02-07 09:14 no, it does not exist 2008-02-07 09:15 what I have is /var/lib/zumastor/volumes/zumatest/targets/dubna 2008-02-07 09:15 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-07 09:15 but I have no /var/lib/zumastor/volumes/zumatest/master/schedule/ directory 2008-02-07 09:16 I've repeated all the steps in the howto three times and that directory is not created here 2008-02-07 09:17 I am not sure whether it is a documentation bug or a script bug 2008-02-07 09:17 yet 2008-02-07 09:18 I think the howto reproduces the old install.html step-by-step regarding replication 2008-02-07 09:19 http://zumastor.googlecode.com/svn/trunk/doc/zumastor-howto.html <- this howto? 2008-02-07 09:19 yes 2008-02-07 09:22 /var/lib/zumastor/volumes/zumatest/master/schedule/target/dubna <- this is a bug 2008-02-07 09:22 I think the problem is in line 721 of /bin/zumastor 2008-02-07 09:23 looks suspicious indeed 2008-02-07 09:24 I tried replacing "target/$host" with "hourly" but it didn't work, either 2008-02-07 09:25 ("hourly" was just a stub) 2008-02-07 09:27 we will sort it out 2008-02-07 09:27 slowly working my way though it now 2008-02-07 09:27 cool :-) 2008-02-07 09:27 I don't look into the zumastor script much these days 2008-02-07 09:28 addressing your love to ddsnap?
2008-02-07 09:28 something like love 2008-02-07 09:28 more like needing a lot of work though :) 2008-02-07 09:29 got to catch up with those zfs guys 2008-02-07 09:29 hopefully pass them :) 2008-02-07 09:29 :-) 2008-02-07 09:29 I took a look at ZFS before looking into Zumastor, actually 2008-02-07 09:30 there is a lot to like about it 2008-02-07 09:30 indeed 2008-02-07 09:30 but they don't have replication, do they? 2008-02-07 09:30 things not to like about it: doesn't run on linux; grotesque layering violations 2008-02-07 09:31 they have a prototype 2008-02-07 09:31 and you cannot use it on a "normal" filesystem 2008-02-07 09:31 right 2008-02-07 09:31 I took a look at ext2cow, nilfs, etc, too 2008-02-07 09:31 ddsnap/zumastor is the way to go, I believe 2008-02-07 09:32 that's why I'm here :-) 2008-02-07 09:32 :) 2008-02-07 09:32 anyway, back to the bug 2008-02-07 09:32 ok 2008-02-07 09:32 after a while shapor will wake up 2008-02-07 09:32 and probably spot it in 15 seconds 2008-02-07 09:32 he went to bed past 3am 2008-02-07 09:32 I saw 2008-02-07 09:33 timezone fun 2008-02-07 09:33 I have to leave now and will be back in a couple of hours 2008-02-07 09:34 see you 2008-02-07 09:35 you need the fqdn, not just the hostname 2008-02-07 09:36 line 355 of bin/zumastor looks odd 2008-02-07 09:37 I was going to mention the fqdn, but something else seems to be broken 2008-02-07 09:37 we need a description of the fqdn issue posted somewhere, I always forget why we need it 2008-02-07 09:38 I dimly recall it was for a marginally bogus reason 2008-02-07 09:39 if [[ $kind =~ "^target/" ]]; then <- what is this regexing about? 2008-02-07 09:39 ACTION reads the comment 2008-02-07 09:40 looks like there is supposed to be a file expansion that did not happen 2008-02-07 09:40 leaving target/ in the generated pathname instead of the contents of something 2008-02-07 09:41 fragile 2008-02-07 09:42 ${kind/target\//} <- what is the \ for? 
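Editor's note on the closing question above: in bash pattern substitution, `${var/pattern/replacement}`, an unescaped slash ends the pattern, so the backslash in `${kind/target\//}` marks the slash as part of the pattern `target/`. The replacement is empty, which deletes the match. A minimal sketch follows; the variable name `kind` comes from the log, while the value `target/dubna` and the two result variables are illustrative assumptions.

```shell
#!/bin/bash
# Illustrative only: "kind" mirrors the variable quoted in the log;
# the value and the result variable names are made up for this sketch.
kind="target/dubna"

# ${var/pattern/replacement}: the backslash escapes the slash that belongs
# to the pattern, so bash does not treat it as the pattern/replacement
# separator. The replacement is empty, so the first "target/" is deleted.
stripped_subst="${kind/target\//}"

# ${var#prefix}: removes the shortest matching prefix; the slash needs no
# escaping here, and the match is anchored at the start of the value.
stripped_prefix="${kind#target/}"

echo "$stripped_subst"
echo "$stripped_prefix"
```

Both forms give the same result for this input. They differ when `target/` appears somewhere other than the start: the substitution form removes the first occurrence anywhere in the value, while the `#` form only strips a leading prefix.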
2008-02-07 09:47 shapor, where we are sending a structured command over the pipe, I think we better 2008-02-07 09:47 use a character that stands out more than '/' if it escapes into a pathname 2008-02-07 09:47 just a thought 2008-02-07 09:49 ok, this is supposed to find the token after the / and included it in the pathname: 2008-02-07 09:49 if [[ $kind =~ "^target/" ]]; then 2008-02-07 09:49 new_target_snapshot $vol ${kind/target\//} 2008-02-07 09:50 but it failed for some reason that is not clear to me 2008-02-07 09:51 regex for this application seems overkill to me, but maybe I just don't "get" modern scripting :) 2008-02-07 09:52 in other scripting languages I would "index" on the / and extract the substring after the slash 2008-02-07 09:54 anyway, it does seem to work, except that target/ actually did escape into the filename 2008-02-07 09:57 i chose / as a delimiter because it is the only character that can't be part of a path name 2008-02-07 09:57 I remember 2008-02-07 09:57 but now I ask myself why it matters, since we have complete control of those filenames 2008-02-07 09:58 that would require adding additional regex checks in bash 2008-02-07 09:58 brace expansion is the part of the bash manual I always skipped 2008-02-07 09:58 sure 2008-02-07 09:59 anyway it's not the bug 2008-02-07 09:59 using / seems like the right thing to do 2008-02-07 09:59 I was just mumbling 2008-02-07 09:59 reading the brace expansion section now 2008-02-07 10:02 anyway I'm reading the wrong section 2008-02-07 10:02 it's substitution, not expansion 2008-02-07 10:07 06:21 < pgquiles> /bin/zumastor: '/var/lib/zumastor/volumes/zumatest/master/schedule/target/dubna' does not exist, doing nothing 2008-02-07 10:07 thats from the master.log ? 
2008-02-07 10:07 sounds plausible 2008-02-07 10:08 new_snapshot shouldn't be getting called with kind as target/blah 2008-02-07 10:08 ACTION unsays the substitution thing 2008-02-07 10:08 354 if [[ $kind =~ "^target/" ]]; then 2008-02-07 10:08 355 new_target_snapshot $vol ${kind/target\//} >> $log 2>&1 2008-02-07 10:08 356 else 2008-02-07 10:08 357 new_snapshot $vol $kind >> $log 2>&1 2008-02-07 10:08 358 fi 2008-02-07 10:20 so how did "if [[ $kind =~ "^target/" ]]; then" fail and let the target/ through? 2008-02-07 10:26 it is not obvious how to find the position of a character in a string, in bash 2008-02-07 10:46 shapor: new_target_snapshot $vol ${kind#target/} 2008-02-07 10:47 don't need regex matching and it's shorter :) 2008-02-07 10:58 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-07 10:58 wecome back 2008-02-07 11:08 hi 2008-02-07 11:08 any luck with the replication? 2008-02-07 11:38 pgquiles: did you try it using fqdns ? 2008-02-07 11:39 shapor: yes, same problem 2008-02-07 11:39 what os are you running on? 2008-02-07 11:39 ubuntu server 7.10 2008-02-07 11:40 whats the output of /bin/bash --version 2008-02-07 11:40 GNU bash, version 3.2.25(1)-release (i486-pc-linux-gnu) 2008-02-07 11:42 can you try 2008-02-07 11:42 test="target/test" 2008-02-07 11:42 [[ $test =~ "^target/" ]] && echo yes 2008-02-07 11:42 does that output "yes" ? 2008-02-07 11:43 shapor: no, it does not 2008-02-07 11:43 !!! 2008-02-07 11:44 what bash are you running? 2008-02-07 11:44 GNU bash, version 3.1.17(1)-release (i486-pc-linux-gnu) 2008-02-07 11:44 yay bash 2008-02-07 11:44 it works without quotes 2008-02-07 11:44 hrm 2008-02-07 11:45 test=target/test 2008-02-07 11:45 [[ $test =~ ^target/ ]] && echo yes 2008-02-07 11:45 what's your output without quotes? 
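Editor's note: the quoted-pattern test above is sensitive to the bash version. From bash 3.2 on, a quoted right-hand side of `=~` is matched as a literal string rather than as a regular expression. One way to write the check so that it does not depend on that rule is to keep the pattern in a variable and expand it unquoted. This is a sketch under that assumption, not the change that was committed to zumastor; the names `re` and `result` are hypothetical.

```shell
#!/bin/bash
# Sketch only: "kind" and "target/test" follow the names used in the log;
# "re" and "result" are hypothetical names for this example.
kind="target/test"

# The regex lives in a variable and is expanded unquoted on the right of =~,
# so it is interpreted as a regular expression regardless of how the shell
# treats quoted patterns.
re='^target/'

if [[ $kind =~ $re ]]; then
    # Prefix removal strips the leading "target/" without any regex.
    result="target snapshot for ${kind#target/}"
else
    result="ordinary snapshot of kind $kind"
fi
echo "$result"
```

The prefix-removal form `${kind#target/}` suggested in the log avoids regex matching for the extraction step altogether.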
2008-02-07 11:45 i think the quotes only matter in the regex 2008-02-07 11:45 it works without quotes 2008-02-07 11:46 so its a subtlety in the regex parsing 2008-02-07 11:46 this also works for me: 2008-02-07 11:46 test=target/test 2008-02-07 11:46 oops 2008-02-07 11:46 test="target/test" 2008-02-07 11:46 [[ $test =~ ^target/ ]] && echo yes 2008-02-07 11:46 yeah 2008-02-07 11:46 thats what we should change it to 2008-02-07 11:46 damn quotes! :-D 2008-02-07 11:48 Committed revision 1337. 2008-02-07 11:48 hah 2008-02-07 11:49 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-07 11:49 ok this is the leet revision 2008-02-07 11:52 shapor: to trunk, to 0.6 or to both? 2008-02-07 11:53 trunk, should probably put it in 0.6 as well 2008-02-07 11:53 0.6.1, maybe? 2008-02-07 12:41 shapor, and how about my non-regex expansion, ${kind#target/} 2008-02-07 12:41 not the part that broke, but still 2008-02-07 12:41 I wonder what the truth about the quotes is 2008-02-07 12:42 my rational side says it should work with or without quotes 2008-02-07 12:44 [[ "^target/" == ^target/ ]]; echo $? 2008-02-07 12:44 outputs 0 2008-02-07 12:45 flips: same here with bash 3.25 2008-02-07 12:45 3.2.25? 2008-02-07 12:45 yes, sorry 2008-02-07 12:45 which bash did it break on? 2008-02-07 12:46 mine 2008-02-07 12:46 which is? 2008-02-07 12:46 3.2.25 2008-02-07 12:46 right 2008-02-07 12:48 I'm searching the CHANGES file in bash 3.2 for any reference to the quotes issue 2008-02-07 12:50 p. The pattern substitution code no longer performs quote removal on the 2008-02-07 12:50 pattern before trying to match it, as the pattern removal functions do. 2008-02-07 12:51 might it be that? 2008-02-07 12:52 probably not, that was back in bash 3.0 2008-02-07 12:57 then there's bash32-010: 2008-02-07 12:57 Bash used backslashes 2008-02-07 12:57 to quote all characters when the pattern argument to the [[ special 2008-02-07 12:57 command's =~ operator was quoted. 
This caused the match to fail on Linux 2008-02-07 12:57 and other systems using GNU libc 2008-02-07 13:02 I would like to get a bug report out of this if we can 2008-02-07 13:13 I'm building a bash 3.2.33 to test the regexp vs quotes issue 2008-02-07 13:29 shapor's test fails with 3.2.33 the same way it failed with 3.2.25 2008-02-07 13:51 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-07 14:01 but it works fien with 3.2.0 2008-02-07 14:01 fine 2008-02-07 14:03 wtf! it works fine with 3.2.33 and 3.2.25!? 2008-02-07 14:05 ok, ok, I was quoting test="target/test" instead of the regexp 2008-02-07 14:25 not debian/ubuntu-specific: it works with bash 3.1, it does not with bash 3.2.0 2008-02-07 14:35 nothing in bash 3.2 CHANGES file indicates the change in behavior is desired. I've filed a bug report. 2008-02-07 14:58 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 15:14 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-07 15:51 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-07 16:02 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-07 16:19 should I be worried about "Fri Feb 8 01:12:08 2008: [5293] event_parse_options: invalid count in DDSNAP_COUNT"? :-? 2008-02-07 16:21 yay! replication is working! 2008-02-07 16:21 btw, I discovered something about fqdn's 2008-02-07 16:21 they don't work either 2008-02-07 16:22 in zumastor define target and zumastor define source you have to use the same name `uname -n` returns 2008-02-07 16:22 anything else, replication fails 2008-02-07 16:22 Interesting 2008-02-07 16:23 pgquiles: odd 2008-02-07 16:25 this is the error you receive if `uname -n` != fqdn: 2008-02-07 16:25 Fri Feb 8 00:51:13 CET 2008 /bin/zumastor[5448]: target trigger for dubna does not exist 2008-02-07 16:25 pgquiles_: Definitely create an issue for that. 
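Editor's note: according to the report above, the name given to `zumastor define target` and `zumastor define source` has to equal what `uname -n` prints on the machine being named. A small pre-flight check along these lines could catch a mismatch before replication is configured. The function `name_matches_nodename` is a hypothetical helper and not part of zumastor; the host names are taken from the log.

```shell
#!/bin/bash
# Hypothetical helper, not part of zumastor: compares the name you plan to
# pass to "zumastor define target" or "zumastor define source" with the
# node name the host reports.
name_matches_nodename() {
    local configured="$1"
    # The optional second argument exists only so the check can be
    # demonstrated without depending on the local host name.
    local nodename="${2:-$(uname -n)}"
    [[ "$configured" == "$nodename" ]]
}

# Example with the names from the log: the FQDN does not equal the short
# node name, so this is reported as a mismatch.
if name_matches_nodename "dubna.arisnova.local" "dubna"; then
    check="match"
else
    check="mismatch"
fi
echo "$check"
```

In practice the function would be run on the remote machine itself with a single argument, so that `uname -n` is evaluated there.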
2008-02-07 16:26 cbsmith: dan kegel created it this morning (yesterday for you) 2008-02-07 16:26 I had just added my comment on that 2008-02-07 16:26 ah 2008-02-07 16:26 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-07 16:26 actually, you could call it misconfiguration 2008-02-07 16:27 ACTION wonders if dkegel has a surveillance bot in the channel :-D 2008-02-07 16:27 pgquiles_: that'd be me 2008-02-07 16:27 if /etc/hostname contained a fqdn, fqdn would have worked 2008-02-07 16:27 cbsmith: oh :-D 2008-02-07 16:28 replication is working fine but snapshots are being lost in the replica, only the last snapshot "survives" 2008-02-07 16:28 I'm trying this with the 5G data + 5G snapshots example 2008-02-07 16:29 dkegel: I'm going to try to resize 2008-02-07 16:29 do I need to run zumastor stop master, zumastor stop source and/or zumastor stop target before I attempt lvresize? 2008-02-07 16:30 you do need to stop everything, if I am not mistaken 2008-02-07 16:30 ok, doing now 2008-02-07 16:33 the howto is a little wrong in "Resize a Zumastor volume" 2008-02-07 16:33 it says testvol where it should say zumatest 2008-02-07 16:34 (copy & paste from install.html I guess :-) 2008-02-07 16:34 some warnings about read only variables: 2008-02-07 16:34 root@spectrum:/home/arisnova# zumastor resize zumatest --origin 10G 2008-02-07 16:34 Device sizes after resizing: 2008-02-07 16:34 origin device 10737418240 2008-02-07 16:35 snapshot device 5368709120 2008-02-07 16:35 metadata device 5368709120 2008-02-07 16:35 /lib/zumastor/common: line 443: local: vol: readonly variable 2008-02-07 16:35 /lib/zumastor/common: line 148: local: vol: readonly variable 2008-02-07 16:35 /lib/zumastor/common: line 219: local: vol: readonly variable 2008-02-07 16:35 /lib/zumastor/common: line 223: local: server: readonly variable 2008-02-07 16:35 /lib/zumastor/common: line 148: local: vol: readonly variable 2008-02-07 16:38 ok, done 2008-02-07 16:39 mmm I guess df
showing different sizes for the snapshots before and after the resizing is right: 2008-02-07 16:39 Filesystem Size Used Avail Use% Mounted on 2008-02-07 16:40 Looks like bash is whining that a local read-only variable is hiding a global read-only variable. 2008-02-07 16:40 yes, that is the expected behavior 2008-02-07 16:40 /dev/mapper/zumatest(16) 2008-02-07 16:40 5.0G 139M 4.6G 3% /var/run/zumastor/snapshot/zumatest/2008.02.08-01.28.07 2008-02-07 16:40 /dev/mapper/zumatest(18) 2008-02-07 16:40 9.9G 140M 9.3G 2% /var/run/zumastor/snapshot/zumatest/2008.02.08-01.37.12 2008-02-07 16:40 pgquiles_: That looks good. 2008-02-07 16:41 cbsmith: indeed 2008-02-07 16:42 ooops 2008-02-07 16:42 not that good, replication is not working: 2008-02-07 16:43 got client connection 2008-02-07 16:43 processing 2008-02-07 16:43 Fri Feb 8 01:42:34 2008: [10801] apply_delta_extents: could not read header for extent starting at chunk 13 of 5174 total chunks: Input/output error 2008-02-07 16:43 Fri Feb 8 01:42:34 2008: [10801] ddsnap_delta_server: closing connection on error: unable to apply upstream delta to device "/dev/mapper/zumatest" 2008-02-07 16:43 Fri Feb 8 01:42:34 2008: [10800] ddsnap_delta_server: unable to accept connection: Interrupted system call 2008-02-07 16:43 Fri Feb 8 01:42:34 2008: [10800] ddsnap_delta_server: Caught signal 17 2008-02-07 16:43 Fri Feb 8 01:42:36 2008: [10800] ddsnap_delta_server: unable to accept connection: Interrupted system call 2008-02-07 16:43 Fri Feb 8 01:42:36 2008: [10800] ddsnap_delta_server: Caught signal 15 2008-02-07 16:43 Fri Feb 8 01:42:36 2008: [10821] daemonize: starting at Fri Feb 8 01:42:36 2008 2008-02-07 16:44 yes, that seems quite bad 2008-02-07 16:44 lovely.
2008-02-07 16:44 I'm not sure what's happening, e2fsck said it had fixed the filesystem on the target after resizing 2008-02-07 16:44 it did not complain about corrupt filesystem on the origin, though 2008-02-07 16:44 that seems to be some problem in replication after resizing 2008-02-07 16:45 jiayingz is about to say something very embarrassing 2008-02-07 16:45 pgquiles_: did you give the remote side more room as well? 2008-02-07 16:45 I haven't tested replication after resizing 2008-02-07 16:45 cbsmith: yes, the exact same size on both sides 2008-02-07 16:45 I believe this is what they call a brown-paper-bag bug 2008-02-07 16:46 ACTION puts on the brown-paper-bag 2008-02-07 16:46 :-D 2008-02-07 16:47 what is the size on target? 2008-02-07 16:48 formerly, 5G 2008-02-07 16:48 now, 10G 2008-02-07 16:48 on both sides 2008-02-07 16:48 ext3, in case it helps 2008-02-07 16:48 So now we will start testing replication after resize here. Can you continue testing without resize? 2008-02-07 16:49 the disks are plugged to a RAID controller with a 512MB cache, is that important? 2008-02-07 16:49 dkegel: sure, resizing is not important for me now 2008-02-07 16:49 ACTION spins up some test nodes 2008-02-07 16:49 that should not be the problem. I guess the problem is ddsnap replication code does not handle different snapshot sizes correctly 2008-02-07 16:50 I will look at it 2008-02-07 16:50 I will try replication after resizing again tomorrow, in case I have missed anything 2008-02-07 16:50 I must go to bed now, it's almost 2am here and I wake up at 6.30am 2008-02-07 16:50 see you in a few hours :-) 2008-02-07 17:25 after some futzing, ive got test nodes up 2008-02-07 18:07 ick.
/lib/zumastor/common: line 51: /dev/stdout: No such device or address 2008-02-07 18:09 That bit breaks when your running commands over ssh 2008-02-07 18:26 patch mailed 2008-02-07 21:00 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 22:13 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 22:25 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 22:38 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 22:53 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-07 23:02 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor irc.oftc.net #zumastor log beginning Fri Feb 8 00:00:01 PST 2008 2008-02-08 00:34 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-08 05:02 -!- zumalog(~zumalog@yzf.shapor.com) has joined #zumastor 2008-02-08 05:55 -!- vda(~vda@ip-89-102-33-54.karneval.cz) has joined #zumastor 2008-02-08 06:06 -!- vda(~vda@ip-89-102-33-54.karneval.cz) has joined #zumastor 2008-02-08 07:10 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-08 07:35 zumastor is skipping snapshot numbers 2008-02-08 07:35 Filesystem 1K-blocks Used Available Use% Mounted on 2008-02-08 07:35 /dev/mapper/zumatest 51606140 184312 48800388 1% /var/run/zumastor/mount/zumatest 2008-02-08 07:35 /dev/mapper/zumatest(0) 2008-02-08 07:35 51606140 184272 48800428 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.29.55 2008-02-08 07:35 /dev/mapper/zumatest(2) 2008-02-08 07:35 51606140 184280 48800420 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.30.18 2008-02-08 07:35 /dev/mapper/zumatest(4) 2008-02-08 07:35 51606140 184280 48800420 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.30.20 2008-02-08 07:35 /dev/mapper/zumatest(6) 2008-02-08 07:35 51606140 184284 48800416 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.30.23 
2008-02-08 07:35 /dev/mapper/zumatest(8) 2008-02-08 07:35 51606140 184288 48800412 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.31.00 2008-02-08 07:36 /dev/mapper/zumatest(12) 2008-02-08 07:36 51606140 184296 48800404 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.31.36 2008-02-08 07:36 /dev/mapper/zumatest(18) 2008-02-08 07:36 51606140 184300 48800400 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-15.52.22 2008-02-08 07:36 /dev/mapper/zumatest(20) 2008-02-08 07:36 51606140 184304 48800396 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-16.33.50 2008-02-08 07:36 /dev/mapper/zumatest(24) 2008-02-08 07:36 51606140 184308 48800392 1% /var/run/zumastor/snapshot/zumatest/2008.02.08-16.34.12 2008-02-08 07:36 is that right? snapshots 10, 14, 16 and 22 are missing :-? 2008-02-08 07:37 mmm it seems the missing snapshots are the ones created on replication 2008-02-08 09:44 hi pgquiles 2008-02-08 09:46 could you have a look at the /var/log/zumastor/zumatest/server.log and /var/log/zumastor/zumatest/master.log 2008-02-08 09:52 this may happen if snapshots 10, 14, 16, and 22 were used by replication 2008-02-08 10:01 jiayingz: yes, that was it 2008-02-08 10:02 jiayingz: in the replica, zumastor status --usage shows there are two snapshots in the snapstore but only one of them is mounted 2008-02-08 10:03 why is that? I thought the problem with replicated snapshots was zumastor was keeping only one snapshot in the replica (the most recent one) 2008-02-08 10:12 pgquiles, we replay the journal of the replicated filesystem on the target side, which changes the snapshot, but since we also need an exact replica to receive the next delta, we take an additional snapshot on the target and replay the journal on that 2008-02-08 10:13 did that make sense? 2008-02-08 10:14 flips: thank you 2008-02-08 11:22 should I 'zumastor stop target' before 'zumastor stop master'? and 'zumastor stop master' before 'zumastor forget volume'? 
2008-02-08 11:24 we _should_ allow all of those in any order 2008-02-08 11:25 but I have not tested all combinations myself 2008-02-08 11:26 ok 2008-02-08 11:26 when I was trying resizing yesterday I wondered if it failed because I was missing any of those 2008-02-08 11:27 btw, why is there a 'zumastor start volume' but no 'zumastor stop volume'? :-? 2008-02-08 11:27 no good reason 2008-02-08 11:27 needs a bug filed :) 2008-02-08 11:27 :-) 2008-02-08 11:30 issue 57 filed 2008-02-08 11:50 thanks 2008-02-08 11:50 I have a prototype python script that does this right, I think I might post it 2008-02-08 11:50 and invite somebody to fill in the missing bits 2008-02-08 11:51 do you mean /bin/zumastor is moving from bash to python? 2008-02-08 11:51 no, just that I wrote a prototype in python to see how it would look 2008-02-08 11:51 ok 2008-02-08 11:51 and to learn python :) 2008-02-08 11:51 :-) 2008-02-08 11:52 I'm more of a ruby guy 2008-02-08 11:52 for historical reasons which no longer are valid 2008-02-08 11:52 I ran the online ruby tutorial 2008-02-08 11:52 :-D 2008-02-08 11:52 it's a small world 2008-02-08 11:53 ran, not wrote 2008-02-08 12:01 "EU investigates Microsoft's OOXML campaign" http://www.theregister.co.uk/2008/02/08/ooxml_eu_probe_iso/ 2008-02-08 12:01 :) 2008-02-08 12:06 MS will spend a couple more millions with those "investigators" 2008-02-08 12:06 http://daringfireball.net/2008/02/yahoo_translation 2008-02-08 12:06 :-) 2008-02-08 12:16 lol 2008-02-08 12:55 We should do something about the bashisms test that fails every test run 2008-02-08 12:56 either remove the test, or make it pass 2008-02-08 12:56 remove the test 2008-02-08 12:56 what was it designed to test? 
2008-02-08 13:03 I think it was designed to express an opinion 2008-02-08 13:04 there is also some concern about debian's more limited shell used in some parts of debian installation 2008-02-08 13:04 but I can't think of any zumastor code that would be affected by that 2008-02-08 13:23 regarding issue 56, do you know where one would hunt to clean that up 2008-02-08 13:24 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-08 14:06 flips: ^ 2008-02-08 14:21 flips: I might have been the one who caused the existence of the bashisms test 2008-02-08 14:21 flips: it was back when zumastor had #!/bin/sh instead of #!/bin/bash 2008-02-08 14:21 #!/bin/sh is dash in ubuntu (and probably debian, too), therefore some bash-specific features failed 2008-02-08 14:23 willn: ^ 2008-02-08 14:30 I don't understand issue 59 :-D 2008-02-08 14:32 the reporter uses an invalid command-line and realizes it but still files the bug report :-? 2008-02-08 15:42 pgquiles: That was me :D 2008-02-08 15:43 It's the error that was amusing, and reason for the bug report 2008-02-08 15:43 willn: :-D 2008-02-08 15:43 'ourly is not a valid hourly limit' 2008-02-08 15:53 ACTION is happy with the apt setup scripts 2008-02-08 21:21 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-08 22:07 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-08 22:20 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor irc.oftc.net #zumastor log beginning Sat Feb 9 00:00:01 PST 2008 2008-02-09 00:37 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-09 01:25 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-09 02:21 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-09 12:20 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-09 12:22 second
line in zumastor.org: "The current release is 0.6, which adds support for offline resizing." 2008-02-09 12:22 quite unfortunate :-) 2008-02-09 13:08 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-09 13:33 pgquiles, unfortunate? 2008-02-09 13:36 flips: given that resizing does not work properly and breaks replication, advertising it as the great new feature in 0.6 looks unfortunate to me 2008-02-09 13:36 pgquiles, let's change that to "experimental support" 2008-02-09 13:43 pgquiles, how does it look now? 2008-02-09 13:45 hmm, we left a couple of current contributors off the list 2008-02-09 13:45 namely, pgquiles and benm 2008-02-09 13:46 and jeff shroeder 2008-02-09 13:46 it looks better 2008-02-09 13:46 see my post on resizing snapshots? 2008-02-09 13:46 I sent a patch to the howto warning about the broken resizing and telling about the tricks with FQDNs 2008-02-09 13:46 flips: yes 2008-02-09 13:46 fun stuff for the future 2008-02-09 13:47 indeed 2008-02-09 13:47 I'm sure somebody will find a use, like for example root filesystems for xen etc 2008-02-09 13:47 I've left a 3TB data + 3.25TB snapshots volume replicating for the weekend, let's see if it works fine 2008-02-09 13:48 would be the biggest replication test so far 2008-02-09 13:49 btw, at the beginning of the replication there's some place where zumastor needs the date but it gets "Creation date" instead 2008-02-09 13:50 replication works fine but given that no progress is reported and first replication might take quite some time, it's a bit disturbing 2008-02-09 13:51 I haven't found the time to debug that and fix it, though :-( 2008-02-09 14:13 pgquiles, what gets creation date?
2008-02-09 14:13 shapor has a little script to monitor replication progress 2008-02-09 14:13 we need to post it somewhere 2008-02-09 14:13 and think about making it a zumastor feature 2008-02-09 14:14 maybe: zumastor status --monitor 2008-02-09 14:14 or something 2008-02-09 14:15 flips: I cannot remember where in the logs is the error, I'm vpn'ing to work 2008-02-09 14:15 when you find it again, just open an issue 2008-02-09 14:19 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-09 14:19 in the replica, in source.log: /bin/zumastor[5780]: umounting zumatest(1) on source stop 2008-02-09 14:19 date: invalid date ` Creation time' 2008-02-09 14:27 whoops 2008-02-09 14:27 typical bash bug 2008-02-09 14:33 pgquiles, re bashism, the correct fix is to change /bin/sh to /bin/bash everywhere that can't possibly end up in an environment where bash is not available 2008-02-09 14:35 flips: I cannot imagine such an environment, given that /bin/zumastor requires bash (#!/bin/bash in header) and it only ssh's machines which have zumastor installed 2008-02-09 14:35 I can't either 2008-02-09 14:36 I wonder if anybody would ever want to put zumastor in an initrd 2008-02-09 14:37 well, if they do then they can put in bash as well 2008-02-09 14:37 in /bin/zumastor, run_remote uses 'sh' instead of 'bash' 2008-02-09 14:38 that should be fixed 2008-02-09 14:38 yes 2008-02-09 14:38 want to enter the issue? 2008-02-09 14:38 doing now :-) 2008-02-09 14:39 could that be why the pattern match failed yesterday? 
2008-02-09 14:40 I can't remember now 2008-02-09 14:40 I'll take a look at the log from the bug 2008-02-09 14:40 and I will see if the bash man page has a non-regex way of doing the test 2008-02-09 14:47 flips: probably 'sh' instead of 'bash' is the reason bashisms.sh is failing 2008-02-09 14:47 look at this: 2008-02-09 14:47 update-alternatives --install /bin/sh sh /bin/dash 1 2008-02-09 14:48 I've added a bash dependency to my ubuntu packages, to make sure it's always there 2008-02-09 14:50 yes, that would be the reason 2008-02-09 14:54 how about this: kind=target/daily; [[ ${kind#target/} != $arg ]] ; echo $? 2008-02-09 14:54 evaluates to true iff $kind starts with target/ 2008-02-09 14:55 no regex 2008-02-09 14:56 to make it clearer: kind=daily; [[ ${kind#target/} != $arg ]] ; echo $? kind: ${kind#target/} 2008-02-09 14:56 clearer? lol 2008-02-09 14:56 :-) 2008-02-09 14:56 well the payload is just [[ ${kind#target/} != $arg ]] 2008-02-09 14:56 the rest is a unit text ;) 2008-02-09 14:57 the # form is covered under "parameter expansion" in the bash man page 2008-02-09 14:58 it just strips a string off the beginning of a word, if the string matches exactly 2008-02-09 14:58 in other words, just what we want in order to detect and decude a formatted string beginning with "target/" 2008-02-09 14:59 decode that is 2008-02-09 14:59 the typical way is to use a regex, which is overkill 2008-02-09 15:32 -!- cbsmith(~xman@expo-140.socallinuxexpo.org) has joined #zumastor 2008-02-09 17:51 Sun Feb 10 01:27:10 CET 2008 /bin/zumastor[4846]: new snapshot will be '60' 2008-02-09 17:51 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-09 17:51 Sun Feb 10 01:27:46 CET 2008 /bin/zumastor[4846]: dropping snapshot for zumatest(60) 2008-02-09 17:51 Sun Feb 10 01:37:11 CET 2008 /bin/zumastor[4846]: new snapshot will be '62' 2008-02-09 17:51 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-09 17:51 Sun Feb 10 01:38:25 CET 
2008 /bin/zumastor[4846]: dropping snapshot for zumatest(62) 2008-02-09 17:51 device-mapper: remove ioctl failed: Device or resource busy 2008-02-09 17:51 Command failed 2008-02-09 17:51 Sun Feb 10 01:38:25 CET 2008 /bin/zumastor[4846]: remove device failed for zumatest(62) 2008-02-09 17:51 not good 2008-02-09 17:51 getting late over there 2008-02-09 17:51 indeed 2008-02-09 17:52 it's a 3TB volume which has not replicated yet (although replication was started more than 4 hours ago) 2008-02-09 17:52 I'm copying about 100GB of data 2008-02-09 17:52 is zumastor able to take a snapshot while data is being copied? 2008-02-09 17:52 yes 2008-02-09 17:53 no idea what's happenning, then 2008-02-09 17:53 it's a local copy? 2008-02-09 17:53 thinking about it 2008-02-09 17:53 so you copied an additional 100GB of data onto the upstream volume? 2008-02-09 17:54 yes 2008-02-09 17:54 I took snapshots, then started replication 2008-02-09 17:55 then about half an hour later, I started to transfer about 100GB of data from another server to the origin 2008-02-09 17:55 according to the target, the replication is going on 2008-02-09 17:55 but the transfer rate is astonishingly low 2008-02-09 17:56 in the order of kilobytes/second 2008-02-09 17:56 how do you measure the transfer rate? 2008-02-09 17:56 zumastor status --usage 2008-02-09 17:56 look at " apply: 10 25068646/196608000 410724696064" 2008-02-09 17:56 wait a few seconds 2008-02-09 17:56 zumastor status --usage 2008-02-09 17:57 apply: 10 25101075/196608000 411256012800 2008-02-09 17:57 :-) 2008-02-09 17:58 maybe it's because it's creating the filesystem (that's still the first replica) 2008-02-09 17:59 so you look at the usage in the downstream snapshot, figure out how many blocks that is, and divide by the time? 
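The measurement pgquiles describes (read the `apply:` line from `zumastor status --usage` twice and compare) can be sketched in Python. Several things here are assumptions rather than documented behaviour: the field meanings (snapshot id, chunks done/total, bytes so far) are inferred from the two pasted samples, the names `parse_apply` and `rate_mb_per_s` are made up for illustration, and the 20 second interval is hypothetical because the log only says "wait a few seconds".

```python
import re

# Matches lines such as: "apply: 10 25068646/196608000 410724696064"
APPLY_RE = re.compile(r"apply:\s+(\d+)\s+(\d+)/(\d+)\s+(\d+)")


def parse_apply(line):
    """Split an apply line into its four numeric fields (meanings assumed)."""
    match = APPLY_RE.search(line)
    if match is None:
        raise ValueError("not an apply line: %r" % line)
    snapshot, done, total, nbytes = map(int, match.groups())
    return {"snapshot": snapshot, "done": done, "total": total, "bytes": nbytes}


def rate_mb_per_s(first, second, seconds):
    """Average throughput between two samples, in decimal megabytes per second."""
    return (second["bytes"] - first["bytes"]) / seconds / 1e6


# The two samples pasted in the channel.
a = parse_apply(" apply: 10 25068646/196608000 410724696064")
b = parse_apply(" apply: 10 25101075/196608000 411256012800")

INTERVAL_S = 20.0  # hypothetical: the real gap between the samples was not recorded
print("chunks applied between samples:", b["done"] - a["done"])
print("approx MB/s over the assumed interval:", rate_mb_per_s(a, b, INTERVAL_S))
```

Using the last field directly avoids having to know the chunk size; only the sampling interval has to be measured properly, for example by timestamping each read.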
2008-02-09 18:00 right, the apply file 2008-02-09 18:00 unintuitive name 2008-02-09 18:01 pedestrian but gives a rough estimate 2008-02-09 18:01 102GB and still copying data to the origin :-) 2008-02-09 18:01 it'll have about 300GB when it's finished 2008-02-09 18:02 let's see if zumastor is able to cope with that 2008-02-09 18:03 so its sending snapshot 10? 2008-02-09 18:03 yes 2008-02-09 18:03 that snapshot only contains a 30 bytes text file 2008-02-09 18:03 ok, 112 bytes text file 2008-02-09 18:04 chunk size is 4K? 2008-02-09 18:04 there were six snapshots of that file, with different contents 2008-02-09 18:04 or, hmm, I think we default to 16K 2008-02-09 18:04 I think it was 16k 2008-02-09 18:04 right 2008-02-09 18:05 I've seen it but I can't remember where 2008-02-09 18:05 so it has sent 1.6 TB so far? 2008-02-09 18:05 no 2008-02-09 18:06 IIUC it will send 200 MB for the first snapshot (196608000 bytes, exactly) 2008-02-09 18:06 sorry, calculated wrong 2008-02-09 18:06 I don't know what the third number is 2008-02-09 18:06 fourth, I mean, the last one 2008-02-09 18:06 320 MB, does that sound right? 2008-02-09 18:07 that is 196608000 * 2**14 2008-02-09 18:07 makes sense 2008-02-09 18:08 410MB 2008-02-09 18:08 that is 25068646 * 2**14 2008-02-09 18:09 finally did it right I think 2008-02-09 18:10 ack 2008-02-09 18:10 410GB 2008-02-09 18:10 28.5 MB/sec 2008-02-09 18:11 so did I get that right? 2008-02-09 18:11 410GB? 2008-02-09 18:11 /dev/mapper/zumatest(12) 2008-02-09 18:11 2.9T 201M 2.9T 1% /var/run/zumastor/snapshot/zumatest/2008.02.09-21.37.11 2008-02-09 18:11 4 hours replicating, 25068646 chunks at 16k/chunk 2008-02-09 18:12 ah, ok 2008-02-09 18:12 I thought 25068646 were bytes instead of chunks 2008-02-09 18:12 but why is it sending 410GB? 2008-02-09 18:12 the snapshot is only 201MB 2008-02-09 18:12 is there a whopping 409.8GB overhead? 
2008-02-09 18:12 good question 2008-02-09 18:12 no 2008-02-09 18:13 :-) 2008-02-09 18:13 it seems to be replicating the whole volume 2008-02-09 18:13 is this the first replication cycle? 2008-02-09 18:13 yes 2008-02-09 18:13 ok, expected 2008-02-09 18:13 but that's unlikely, too 2008-02-09 18:13 what is unlikely? 2008-02-09 18:13 the volume is 3000GB data 2008-02-09 18:13 not 410GB 2008-02-09 18:14 25068646 * 2**14 = 410 GB or else I have brain damage 2008-02-09 18:14 which is always possible ;) 2008-02-09 18:15 ahh 2008-02-09 18:15 ok 2008-02-09 18:15 it has transferred 410GB and will transfer 3TB 2008-02-09 18:15 or, if I used binary gigabytes, 25068646 * 2**14 / 2**30. = 382 gibibytes 2008-02-09 18:15 I thought you meant it will transfer 410GB 2008-02-09 18:15 yes, 100% right 2008-02-09 18:15 what kind of link is it going over? 2008-02-09 18:16 gigabit ethernet 2008-02-09 18:16 probably not using a high percentage of the bandwidth 2008-02-09 18:16 both servers in the same switch 2008-02-09 18:16 probably 2008-02-09 18:16 it's bottlenecking on snapshot read or snapshot write, still 30 MB/sec isn't totally awful 2008-02-09 18:17 maybe it's the SATA disks 2008-02-09 18:17 just less than we are capable of 2008-02-09 18:17 it's us, we only read/write snapshots at around 30 MB/sec 2008-02-09 18:17 let's see what kind of information the switch shows 2008-02-09 18:17 due for optimization 2008-02-09 18:17 oh 2008-02-09 18:17 optimization for what? 
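The arithmetic that took a few tries above can be checked in one place. This sketch uses only numbers from the log: the chunk counts from the `apply:` line, the 16K chunk size the participants settle on, and the "4 hours replicating" estimate. The variable names are illustrative.

```python
CHUNK = 2 ** 14            # 16 KiB chunk size, as agreed in the channel
done_chunks = 25068646     # first number in the apply line
total_chunks = 196608000   # second number in the apply line
elapsed_s = 4 * 3600       # "4 hours replicating" (a rough figure from the log)

done_bytes = done_chunks * CHUNK
total_bytes = total_chunks * CHUNK

print(done_bytes)                       # 410724696064, the last apply field
print(round(done_bytes / 1e9, 1))       # decimal gigabytes transferred so far
print(round(done_bytes / 2 ** 30, 1))   # the same amount in gibibytes
print(total_bytes // 2 ** 30)           # whole volume in gibibytes
print(round(100 * done_chunks / total_chunks, 2))  # percent complete
print(round(done_bytes / elapsed_s / 1e6, 1))      # average MB/s so far
```

The product of chunks and chunk size equals the last field of the `apply:` line exactly, which suggests that field is a running byte count. The total works out to 3000 GiB, so the first cycle is copying the whole volume rather than the 201M the filesystem reports as used.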
2008-02-09 18:18 snapshot read locking is not optimal 2008-02-09 18:18 ok 2008-02-09 18:18 has to wait synchronously to get a lock on each 16k chunk read 2008-02-09 18:18 possible to optimize that a lot, but we thought we should try for stability first 2008-02-09 18:19 agreed 2008-02-09 18:19 ok, now what about the device mapper remove problem above 2008-02-09 18:19 it'd be nice if it could realize 99% of the volume are zeros and just take that from /dev/null on the replica 2008-02-09 18:19 yes, that we can do 2008-02-09 18:20 again a question of being sure about base stability before optimizing 2008-02-09 18:20 how is your cpu usage on upstream and downstream? 2008-02-09 18:21 upstream. 2008-02-09 18:21 : 2008-02-09 18:21 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- 2008-02-09 18:21 r b swpd free buff cache si so bi bo in cs us sy id wa 2008-02-09 18:21 1 3 23952 18236 784184 3110572 0 0 76 143 12 66 5 3 86 6 2008-02-09 18:21 0 3 23952 15264 784400 3113452 0 0 16663 77914 13098 35440 20 18 3 60 2008-02-09 18:21 3 3 23952 21624 779640 3111664 0 0 16413 76378 11926 33583 18 17 2 64 2008-02-09 18:22 downstream: 2008-02-09 18:22 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- 2008-02-09 18:22 r b swpd free buff cache si so bi bo in cs us sy id wa 2008-02-09 18:22 0 0 0 3068636 837620 67528 0 0 1 123 132 104 2 1 96 1 2008-02-09 18:22 0 0 0 3067852 838412 67584 0 0 0 5466 2710 3337 4 2 93 1 2008-02-09 18:22 0 0 0 3067852 838308 67568 0 0 0 5025 2397 3204 3 2 95 1 2008-02-09 18:22 the CPUs are not working 2008-02-09 18:22 no, so the real time is getting sucked away by disk seeking and waiting to get snapshot read locks 2008-02-09 18:23 both of which are due for optimization 2008-02-09 18:23 what speed are you disks? 2008-02-09 18:23 7.2krpm 2008-02-09 18:23 16MB cache 2008-02-09 18:23 how many on each end? 
2008-02-09 18:23 9 x 750GB 2008-02-09 18:23 identical disks 2008-02-09 18:23 plugged to a HP Smart Array P400 with a cache of 512MB 2008-02-09 18:23 how much bandwidth on the controller? 2008-02-09 18:24 http://h18000.www1.hp.com/products/quickspecs/12400_na/12400_na.html 2008-02-09 18:24 2GB/s 2008-02-09 18:25 when from/to cache 2008-02-09 18:25 so we should be doing this about 60 times faster 2008-02-09 18:25 right? 2008-02-09 18:25 2GB / 30 MB 2008-02-09 18:25 assuming we do not then bottleneck on the network 2008-02-09 18:26 66.6 times faster, yes 2008-02-09 18:26 have we got a way of measuring the traffic on your gige right now? 2008-02-09 18:27 the java web interface is locked 2008-02-09 18:27 I'm trying to revive it 2008-02-09 18:27 nothing to do with us? 2008-02-09 18:28 no, probably just the slow uplink from work (only 30kb/s) 2008-02-09 18:28 we should teach ddsnap to keep stats on network traffic 2008-02-09 18:28 I should put in an issue for that 2008-02-09 18:28 in theory, generic network tools should be doing that 2008-02-09 18:29 well there should be some place in proc that has some stats 2008-02-09 18:29 when read/write rate is optimized, ssh should be replaced or encryption be simplified 2008-02-09 18:30 otherwise, cpu's will go crazy 2008-02-09 18:30 the replication does not go over ssh 2008-02-09 18:30 only the initial handshake does 2008-02-09 18:30 according to the switch, only 5% of the bandwith in those ports is in use 2008-02-09 18:31 ok, so we could improve replication speed in this case by a factor of 20 by removing bottlenecks in snapshot read and write 2008-02-09 18:31 that is about what we expect from doing the optimizations I posted on the list 2008-02-09 18:32 I thought data was transferred over ssh 2008-02-09 18:32 no, straight over the network 2008-02-09 18:33 it is possible to tunnel of course 2008-02-09 18:33 take the data about the network with a grain of salt 2008-02-09 18:33 I'm seeing lots of collisions from 20 hours ago, long 
before I started replication 2008-02-09 18:34 they stopped about 17 hours ago, 11 hours before I started replication 2008-02-09 18:34 but I'll have to find what caused that 2008-02-09 18:34 ok, how about posting some numbers for cpu and network bandwidth to the list, if you can get them conveniently? 2008-02-09 18:34 fine 2008-02-09 18:34 sounds like it is working as expected, except for the device mapper remove error above 2008-02-09 18:35 going back to look at that now 2008-02-09 18:35 I'll have to stop adding data to the origin to have reliable numbers 2008-02-09 18:35 I have 114G of data now, I think it's enough to test :-) 2008-02-09 18:36 is there any difference in performance between small files and large files? 2008-02-09 18:36 we have some large files but they have not been copied yet 2008-02-09 18:36 large files are worst case for ddsnap 2008-02-09 18:36 at the moment 2008-02-09 18:40 can you post the "remove ioctl failed: Device or resource busy" issue to the list, please? 2008-02-09 18:41 yes 2008-02-09 18:41 thanks 2008-02-09 18:41 and consider getting some sleep? 2008-02-09 18:46 yes, I should do that 2008-02-09 18:46 by the way, the 9 disks should deliver about 540 MB/sec raw, something less when raided 2008-02-09 18:47 they are in raid 6 2008-02-09 18:48 so we are only about 7-8 times slower on read from snapshot + write to origin than we should be, better than 60 2008-02-09 18:48 right 2008-02-09 18:48 even slower 2008-02-09 18:49 somewhere around 400 MB/sec raw if everything is singing 2008-02-09 18:50 I found this about a week ago: 2008-02-09 18:50 http://insights.oetiker.ch/linux/raidoptimization.html 2008-02-09 18:50 it looks interesting but so far I have only skimmed over it 2008-02-09 18:51 "Often the stripe-size is 64 KByte, this means that everything should be aligned to 64 KByte." 2008-02-09 18:51 what is the effect of chunk size in zumastor? what's better, big or small?
2008-02-09 18:52 bigger is faster, and wastes more snapshot store space 2008-02-09 18:52 that is another optimization, there is no fundamental reason for bigger to be faster 2008-02-09 18:58 the "create first copy from /dev/null" optimization is not important atm 2008-02-09 18:58 if it takes one day to have replication, so that be 2008-02-09 18:58 let me know how it goes please 2008-02-09 18:58 I will look here 2008-02-09 18:58 changes to the structure of data and snapshots are important 2008-02-09 18:59 ? 2008-02-09 18:59 you mean, exact replication is important? 2008-02-09 18:59 no 2008-02-09 18:59 the optimizations about chunk size, etc 2008-02-09 18:59 right 2008-02-09 19:00 optimizing phase gets started pretty soon, after a round of clean up patches 2008-02-09 19:00 if I have zumastor in production, changing stopping volumes and recreating them is quite bad 2008-02-09 19:00 if this test and a few more I want to do early next week work fine, I'll have zumastor in production by the end of next week 2008-02-09 19:01 beginning of the other week, tops 2008-02-09 19:01 warning: you will be the first other than zumastor team itself to have a zumastor in production 2008-02-09 19:01 I'll have origin and 3 replicas 2008-02-09 19:01 that said, we are pretty good about not hurting the origin volume 2008-02-09 19:02 and only one of the servers is being accessed, the other 3 are for failure tolerance, etc 2008-02-09 19:03 thus I can stop replicas, copy all data and mutate one of the servers into the origin when optimization starts 2008-02-09 19:04 but I need to have the thing in production so that I can start another project 2008-02-09 19:04 (and because our current servers are almost dead, too :-) 2008-02-09 19:07 see you tomorrow (for me :-) 2008-02-09 23:33 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor irc.oftc.net #zumastor log beginning Sun Feb 10 00:00:01 PST 2008 2008-02-10 04:25 -!- 
pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-10 04:26 looks like snapshots are not being taken on the origin while replication is going on 2008-02-10 04:44 wow, replication failed and has been *restarted* from scratch 2008-02-10 04:57 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 09:05 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-10 09:15 -!- pgquiles__(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 10:23 -!- pgquiles__(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 10:46 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-10 10:51 hmm 2008-02-10 11:11 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-10 11:50 replication is still at 28% :-( 2008-02-10 11:50 -!- pgquiles__(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 12:44 there's something I don't understand 2008-02-10 12:45 given that my volume had only 100 bytes of real data, which are about 200MB of real data to zumastor, and the rest of the 3TB of data (and 3.25TB of snapshots) are empty 2008-02-10 12:45 why is it taking so long to replicate? 
2008-02-10 12:45 it seems that compression is not working 2008-02-10 12:46 (that'd be in addition to the locks issue) 2008-02-10 12:51 dd if=/dev/zero of=./temp bs=16k count=1; gzip -9 temp 2008-02-10 12:52 16384 bytes of zeros get compressed to 56 bytes 2008-02-10 13:02 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-10 13:10 -!- pgquiles__(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 15:06 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-10 15:11 aborted again :-/ 2008-02-10 16:27 -!- pgquiles_(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 17:05 -!- pgquiles__(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-10 17:28 -!- pgquiles_(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-10 17:31 pgquiles_: did you fill the devices with /dev/zero before running mkfs? 2008-02-10 17:31 s/devices/origin device/ 2008-02-10 17:32 shapor: is that a required step? 
2008-02-10 17:32 no, but it would make replication much faster 2008-02-10 17:32 that was re: replication taking so long 2008-02-10 17:33 16384 bytes will only compress to 56 bytes if its filled with 0s 2008-02-10 17:33 not if it has random old data on it 2008-02-10 17:36 We should probably also patch our kernels, per http://it.slashdot.org/article.pl?sid=08/02/10/2011257 2008-02-10 17:36 yep 2008-02-10 17:43 shapor: no 2008-02-10 17:44 shapor: I thought when 'parted' created a new partition, they were automatically filled with zeros 2008-02-10 17:44 its a good optimization to do, its not mentioned in the docs 2008-02-10 17:44 :-/ 2008-02-10 17:44 nope 2008-02-10 17:45 it could be an option to ddsnap initialize i suppose 2008-02-10 17:45 ddsnap initialize --wipe or something 2008-02-10 17:45 er i mean zumastor define volume 2008-02-10 17:45 --wipe 2008-02-10 17:46 since we support dropping zumastor on top of an existing volume that doesn't seem like a good thing to do by default :) 2008-02-10 17:46 pvcreate and lvcreate have a --zero option, but it only wipes the first 4 kbytes 2008-02-10 17:46 hrm 2008-02-10 17:48 shapor: anyways, IIUC what flips told me yesterday, currently replication is slow because reading/writing the chunks is slow due to locks 2008-02-10 17:48 I don't know how much would zero'ing the fs speed up the replication 2008-02-10 17:49 maybe flips can tell 2008-02-10 17:49 it speeds it up a lot 2008-02-10 17:49 i usually do dd if=/dev/zero of=/dev/origin 2008-02-10 17:49 before running mkfs 2008-02-10 17:49 it helps immensely 2008-02-10 17:50 I'll let this 1TB test finish (if it does not fail) and will try that tomorrow 2008-02-10 17:51 what about the -m0 option I'm using? are root-reserved blocks needed? 
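shapor's point about zeroing the origin before mkfs can be illustrated by compressing one 16 KiB chunk of zeros and one chunk of random bytes. This is a sketch only: Python's `gzip` module stands in for whatever compression the replication path actually applies, which the log does not specify, and random bytes stand in for "random old data" left on an unwiped device.

```python
import gzip
import os

CHUNK = 16 * 1024  # 16 KiB, the chunk size discussed in the channel

zero_chunk = bytes(CHUNK)         # what a wiped, never-written chunk looks like
random_chunk = os.urandom(CHUNK)  # stand-in for stale data on an unwiped device

zero_len = len(gzip.compress(zero_chunk, compresslevel=9))
random_len = len(gzip.compress(random_chunk, compresslevel=9))

print("zero chunk compresses to", zero_len, "bytes")
print("random chunk compresses to", random_len, "bytes")
```

The zero chunk shrinks to a few dozen bytes, in line with the `dd | gzip -9` test above, while the random chunk stays at roughly its original size. On a mostly empty 3TB volume that difference applies to nearly every chunk sent during the first replication cycle.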
2008-02-10 17:53 that wont have any effect on ddsnap 2008-02-10 17:53 fs specific stuff all above it 2008-02-10 17:53 ok 2008-02-10 18:11 -!- zumalog(~zumalog@yzf.shapor.com) has joined #zumastor 2008-02-10 18:13 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-10 20:28 shapor, got an idea why pgquiles_ replication aborts? 2008-02-10 20:29 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has left #zumastor 2008-02-10 22:10 a quick fix for a running system: http://www.ping.uio.no/~mortehu/disable-vmsplice-if-exploitable.c 2008-02-10 22:23 i just rebooted all my machines with a patched kernel 2008-02-10 22:23 rather than just breaking splice 2008-02-10 22:27 where is the patch? 2008-02-10 22:33 posted to lkml earlier today 2008-02-10 22:35 after this? "kernel 2.6.24.1 still vulnerable to the vmsplice local root exploit" 2008-02-10 22:36 the original exploit no longer works on my 2.6.18 w/ the patch 2008-02-10 22:37 subject line for the patch? 2008-02-10 22:37 obvious one like "vmsplice" doesn't hit it 2008-02-10 22:39 i'm using the package from the debian bug 2008-02-10 22:40 perhaps they just disable it although that sounds unlikely 2008-02-10 22:40 "unofficial" package 2008-02-10 22:40 nothing really uses vmsplice 2008-02-10 22:41 doesn't xen? 
2008-02-10 22:42 "nothing" ;) 2008-02-10 22:43 my claim isn't actually grounded in research 2008-02-10 22:43 just posts like this: http://lkml.org/lkml/2006/12/21/215 2008-02-10 22:45 http://shapor.com/vmsplice-debian.patch 2008-02-10 22:45 thats from the debian diff 2008-02-10 22:47 http://lkml.org/lkml/2008/2/10/153 2008-02-10 22:47 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=464953 2008-02-10 22:48 it would be good if the bug were linked from the debian front page 2008-02-10 22:48 its linked from planet.debian.org 2008-02-10 22:48 this is my frist visit to planet.debian 2008-02-10 22:48 I think most folks would go to debian.org 2008-02-10 22:49 mine too, google pointed me there 2008-02-10 22:49 hrm yeah its not linked under "security advisories" on debian.org 2008-02-10 22:49 theres not an official debian advisory/fix out yet 2008-02-10 22:50 tried to understand the exploit by reading it, failed 2008-02-10 22:50 I need to wait for the analysis 2008-02-10 22:50 the patch gives a useful hint 2008-02-10 22:52 the "hotfix" seems dangerous 2008-02-10 22:52 the link you pasted 2008-02-10 22:53 lots of reports of random crashes after running it 2008-02-10 22:53 better off with a patched kernel :) 2008-02-10 22:53 that is true 2008-02-10 22:53 esp since its a modified copy of the sploit 2008-02-10 22:53 the exploit, and the exploit closer, corrupt the kernel memory map 2008-02-10 22:54 patching now 2008-02-10 22:54 yeah, there are at least races 2008-02-10 22:54 though accessability to the patch is pretty bad 2008-02-10 22:54 there's a process flaw there 2008-02-10 22:54 trye 2008-02-10 22:54 true* 2008-02-10 22:54 wrong keywords in the lkml post 2008-02-10 22:55 the patch wasn't even posted with "vmsplice" or "exploit" in the subject line 2008-02-10 22:55 I can fix that ;) 2008-02-10 23:20 flips: grok the exploit? 2008-02-10 23:20 nope 2008-02-10 23:20 got a few clues about it 2008-02-10 23:20 the code is weird 2008-02-10 23:21 the exploit code? 
2008-02-10 23:21 *(void **) &(int[2]){0,PAGE_SIZE} 2008-02-10 23:21 wtf 2008-02-10 23:21 seems equivalent to (int[2]){0,PAGE_SIZE} 2008-02-10 23:22 the second one posted is more clear 2008-02-10 23:22 &(int[2]){0,PAGE_SIZE} 2008-02-10 23:22 http://www.milw0rm.com/exploits/5093 2008-02-10 23:22 you're looking at http://www.milw0rm.com/exploits/5092 2008-02-10 23:23 much shorter 2008-02-10 23:23 heh as if intentially obfuscated 2008-02-10 23:24 none of the mmap goofiness 2008-02-10 23:29 all that mmap stuff is determining iov.iov_base 2008-02-10 23:30 looks like it was all replaced with checking /proc/kallsyms for sys_vm86old 2008-02-10 23:31 http://lkml.org/lkml/2008/2/11/25 2008-02-10 23:34 the hole is glaring when pointed out 2008-02-10 23:35 yeah well 2008-02-10 23:35 20/20 hindsight 2008-02-10 23:35 the trampoline is the part I couldn't figure out from the original 2008-02-10 23:35 code address is stored in the page.list.next field 2008-02-10 23:37 i suppose the exploit was discovered by code inspection 2008-02-10 23:38 the exploit could be written shorted, it is unecessary to code get_current in asm 2008-02-10 23:38 shorter I mean 2008-02-10 23:41 how could it be written? 2008-02-10 23:42 i dont get that part 2008-02-10 23:43 current task is obtained by rounding down the stack pointer 2008-02-10 23:43 to a multiple of 4k or 8k depending on the kernel stack size 2008-02-10 23:43 easy to code in C 2008-02-10 23:45 then it scans through looking for the uid for some reason 2008-02-10 23:45 to work on multiple kernel versions? 2008-02-10 23:47 don't know, I've spent more time posting about it than reading it 2008-02-10 23:48 vmsplice will obediently write whatever you want, wherever you want in kernel 2008-02-10 23:48 not too hard to exploit that 2008-02-10 23:49 yeah, glaring 2008-02-10 23:49 how does that get signed off on? 
2008-02-10 23:49 nobody looked close 2008-02-10 23:49 :( 2008-02-10 23:50 code artist is axboe 2008-02-10 23:51 noted lkml flamer 2008-02-10 23:51 cant really blame one person 2008-02-10 23:53 can note that he likes to flame people for posting code with small bugs, while creating a zero day exploit himself 2008-02-10 23:54 can also note that he is mia from the bug hunt and fix irc.oftc.net #zumastor log beginning Mon Feb 11 00:00:01 PST 2008 2008-02-11 00:51 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-11 01:53 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-11 02:19 -!- natalie(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-11 02:45 -!- erwan_taf(~erwan@77.1-14-84.ripe.coltfrance.com) has joined #zumastor 2008-02-11 03:25 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-11 05:05 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-11 05:31 1TB replication successful 2008-02-11 05:31 :-) 2008-02-11 06:35 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-11 06:42 -!- pgquiles(~pgquiles@81.202.65.108) has joined #zumastor 2008-02-11 06:59 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-11 07:08 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-11 08:44 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-11 10:20 shapor, ping? 2008-02-11 10:28 flips pong 2008-02-11 10:28 do we know what is up with pgquiles' replication reset? 2008-02-11 10:29 from his message to the list it sounds like it was successful 2008-02-11 10:29 I like success 2008-02-11 10:30 thanks 2008-02-11 10:31 looks like the devmapper failed remove issue will be taken care of as a side effect of removing squash, no? 2008-02-11 10:32 oh wow, we got our first spam on the list. 
I thought it was members only 2008-02-11 10:40 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-11 11:13 hi pgquiles 2008-02-11 11:14 I can't reproduce the dmsetup remove bug on my testing machine 2008-02-11 11:14 what is the version of bash you are using? 2008-02-11 11:19 bash 3.2.25 2008-02-11 11:20 what whas the bug with dmsetup? 2008-02-11 11:20 the bug you saw "device-mapper: remove ioctl failed: Device or resource busy" 2008-02-11 11:20 during the 3T replication 2008-02-11 11:21 oh, yes 2008-02-11 11:22 I tried to run the replication test on my machine with several replication cycles triggerred during the initial one 2008-02-11 11:22 and it happened with the 1TB volume last night, too 2008-02-11 11:22 the device of the unused snapshot was successfully removed in my case 2008-02-11 11:23 that is why I am wondering if that is because we use different versions of bash 2008-02-11 11:23 I use bash 3.1.17 2008-02-11 11:24 it might be 2008-02-11 11:25 did you see the "invalid date" error in logs? 2008-02-11 11:25 let me try that version of bash 2008-02-11 11:26 how long did it take replication for you? what size was the volume? 2008-02-11 11:26 i am using 20G origin/50 snapshot, with 5sec replication cycle 2008-02-11 11:27 wonder if volume size matters 2008-02-11 11:27 did the problem happen when you tried smaller volume size? 2008-02-11 11:27 no 2008-02-11 11:27 I think the long replication time is what triggers the problem 2008-02-11 11:28 too long replication => too many snapshots asked => dmsetup fails 2008-02-11 11:28 that, or a dmsetup bug 2008-02-11 11:29 I'm using dmsetup 1.02.20 2008-02-11 11:29 with 5sec replication cycle, I also saw several failed replication triggers while the first one hadn't finished yet 2008-02-11 11:29 how is that possible? 
2008-02-11 11:29 I used a 1800 sec replication cycle 2008-02-11 11:29 for the 1TB volume, I mean 2008-02-11 11:30 but you saw the bug from the first failed replication trigger 2008-02-11 11:30 I can't remember if it happened from the first replication 2008-02-11 11:31 let me vpn to work and check 2008-02-11 11:31 I am using dmsetup 1.02.05-1ubuntu1 2008-02-11 11:31 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-11 11:31 ok, I'm in 2008-02-11 11:34 no, it's not from the first replication 2008-02-11 11:34 I'd say it's from the 3rd or 4th one after replication starts 2008-02-11 11:34 no, I'm wrong and you are right 2008-02-11 11:35 Sat Feb 9 21:38:58 CET 2008 /bin/zumastor[4846]: 'hourly' snapshot list is now: 2 4 6 8 12 14 2008-02-11 11:35 Sat Feb 9 21:38:58 CET 2008 /bin/zumastor[4846]: dropping snapshot for zumatest(0) 2008-02-11 11:35 Sat Feb 9 21:46:58 CET 2008 /bin/zumastor[4846]: new snapshot will be '16' 2008-02-11 11:35 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-11 11:35 Sat Feb 9 21:46:59 CET 2008 /bin/zumastor[4846]: dropping snapshot for zumatest(16) 2008-02-11 11:35 device-mapper: remove ioctl failed: Device or resource busy 2008-02-11 11:35 Command failed 2008-02-11 11:35 Sat Feb 9 21:46:59 CET 2008 /bin/zumastor[4846]: remove device failed for zumatest(16) 2008-02-11 11:35 right. 
I only saw " error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger" and 2008-02-11 11:35 Sat Feb 9 21:56:59 CET 2008 /bin/zumastor[4846]: new snapshot will be '18' 2008-02-11 11:35 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-11 11:35 Sat Feb 9 21:56:59 CET 2008 /bin/zumastor[4846]: dropping snapshot for zumatest(18) 2008-02-11 11:35 device-mapper: remove ioctl failed: Device or resource busy 2008-02-11 11:35 Command failed 2008-02-11 11:35 Sat Feb 9 21:56:59 CET 2008 /bin/zumastor[4846]: remove device failed for zumatest(18) 2008-02-11 11:35 " dropping snapshot for ..." 2008-02-11 11:36 no " remove ioctl failed" 2008-02-11 11:36 what's with /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger ? when you write to it, ddsnap takes a snapshot 2008-02-11 11:36 no. when you write to it, 'zumastor target' starts a replication 2008-02-11 11:37 and does 'zumastor target' know what kind of replication it has to perform? (hourl, daily, manual) 2008-02-11 11:37 no. we don't have type of replication at this time 2008-02-11 11:38 replication is triggered periodically, with --period option given to 'define target/source' 2008-02-11 11:38 err type of snapshot, I meant 2008-02-11 11:38 sorry 2008-02-11 11:39 'zumastor master' is the process taking care of snapshot creation. 'zumastor target' takes care of replication and doesn't know anything about hourly or daily snapshots 2008-02-11 11:39 ok 2008-02-11 11:40 what's odd is lsof'ing that device shows no open files, that means there was some open file but it's been already closed 2008-02-11 11:40 maybe I should modify zumastor to log the result of lsof 2008-02-11 11:41 when a replication is triggered, we first write to /var/lib/zumastor/volumes/zumatest/master/trigger 2008-02-11 11:41 receiving that request, 'zumastor master' creates a new snapshot and a new device. 
it then writes to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-11 11:42 to trigger a new replication cycle. because 'zumastor target' is still busy at the initial replication, that request fails 2008-02-11 11:42 so we tried to remove the unused device/snapshot 2008-02-11 11:43 it failed in your test because the removing of device failed with EBUSY 2008-02-11 11:44 the particular lines of code is /bin/zumastor: new_target_snapshot: 158-162 2008-02-11 11:45 and there's more fun: 2008-02-11 11:45 Sun Feb 10 14:51:46 CET 2008 /bin/zumastor[4846]: error: snapshot zumatest(220) not found 2008-02-11 11:46 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-11 11:46 Sun Feb 10 14:51:46 CET 2008 /bin/zumastor[4846]: dropping snapshot for zumatest(220) 2008-02-11 11:46 Sun Feb 10 14:51:46 2008: [10773] usecount: snapshot server is unable to set usecount for snapshot 220 2008-02-11 11:46 Sun Feb 10 14:51:46 2008: [10773] usecount: server reason for usecount failure: Snapshot tag 220 is not valid 2008-02-11 11:46 Sun Feb 10 14:51:46 CET 2008 /bin/zumastor[4846]: couldn't get usecount for vol 'zumatest' snap '220' 2008-02-11 11:46 Sun Feb 10 14:57:38 CET 2008 /bin/zumastor[4846]: new snapshot will be '222' 2008-02-11 11:46 could the ioctl problem be because I was copying files to the origin while replicating? that shouldn't happen but... 2008-02-11 11:47 that is a bug of ddsnap autodelete, I think. 
looks like the newly created snapshot was imediately deleted 2008-02-11 11:48 that shouldn't be the problem because the unused snapshot device wasn't mounted 2008-02-11 11:49 it looks more like a race of device creation 2008-02-11 11:49 -!- charlesnw(~charles@ses.siderean.com) has left #zumastor 2008-02-11 11:53 I've modified drop_snapshot in the origin to show the result of lsof 2008-02-11 11:53 pgquiles, could you check dmesg for "control: control: Control thread started, pid=xxxx for snapshot 16" 2008-02-11 11:54 what time of the messge? 2008-02-11 11:55 http://rafb.net/p/lR5a0u48.html 2008-02-11 11:55 that's for the 1TB replication, unfortunately I cannot dmesg for the 3TB case 2008-02-11 11:56 I keep the logs of the origin but nothing more :- 2008-02-11 11:56 thanks for the log 2008-02-11 11:57 hmm, strange that i didn't see "incoming: identify succeeded" 2008-02-11 11:58 that's because I grepped for 'control' :-) 2008-02-11 11:58 ah, i c 2008-02-11 11:58 http://rafb.net/p/y5Qnoz20.html 2008-02-11 11:59 it is hard to tell the timing, but thanks a lot for the log 2008-02-11 12:02 I can paste the whole dmesg output, I you dare to read the 71kb :-) 2008-02-11 12:03 no thanks :) 2008-02-11 12:04 wonder if it is easy for you to set up a smaller volume, like 10G and trigger replication faster 2008-02-11 12:04 to see if the problem still exists 2008-02-11 12:05 and with the lsof you added in drop_snapshot 2008-02-11 12:06 that might be 2008-02-11 12:06 I want to try shapor's suggestion of zeroing the volume before creating the filesystem 2008-02-11 12:06 jiayingz, I wonder if this could be related to the bash regex issue we pqguiles already hit? 2008-02-11 12:06 zeroed chunks should compress very well, therefore making transmission faster 2008-02-11 12:07 flips, doesn't look like so 2008-02-11 12:07 yes, let's see what happens 2008-02-11 12:07 yes, go ahead to try it. 
2008-02-11 12:07 pgquiles, I thought you said your gige link was not fully utilized, and cpu was low, so I concluded seeking is the bottleneck 2008-02-11 12:08 but perhaps the gige link was actually saturated, in which case zeroing would help a lot 2008-02-11 12:09 flips: no, network was at 25% usage 2008-02-11 12:09 ideally we would see 100% 2008-02-11 12:10 doing the zeroing test will let us know for sure 2008-02-11 12:10 ok, 1TB data + 1.3TB snapshots, previously zeroed, has been started 2008-02-11 12:10 at least it will be interesting if network usage then drops to 8% or something 2008-02-11 12:10 that is also useful 2008-02-11 12:11 to be 100% sure about the network usage, I'd need to setup SNMP or the ProCurve Manager software and have proper statistics 2008-02-11 12:11 the date I've told you so far is from the web interface, which shows peaks and medium values, but nothing more 2008-02-11 12:12 s/date/data 2008-02-11 12:14 we should keep our own statistics in ddsnap I think. At worst, it would be a cross check, but I think it would actually provide information that can't always be obtained at the network interface 2008-02-11 12:15 that'd be better, of course 2008-02-11 12:17 18GB transferred so far 2008-02-11 12:17 we'll see if zeroing is worth in a moment 2008-02-11 12:17 there were 22GB of non-zero data 2008-02-11 12:24 -!- charlesnw(~charles@dctm.siderean.com) has joined #zumastor 2008-02-11 12:28 network is at 3% now, 8% was the peak 2008-02-11 12:29 has the replication finished? 2008-02-11 12:29 flips: have you tried crashing your system with that vulnerability exploit :) 2008-02-11 12:30 natalie, I got root with it, yes 2008-02-11 12:31 on dana's machine ;) 2008-02-11 12:31 hah, ok, anything else bad happened to it? 2008-02-11 12:32 I rebooted when I realized how badly it trashed the lru list 2008-02-11 12:32 and what is the kernel level on that system? 
2008-02-11 12:33 2.6.22 2008-02-11 12:33 jiayingz: no, it's at 4% 2008-02-11 12:33 it is obvious that everything from 2.6.17 through 2.6.24.1 are vulnerable though 2008-02-11 12:33 gaping hole 2008-02-11 12:33 hope it booted ok then 2008-02-11 12:34 it rebooted fine and nothing bad happened 2008-02-11 12:34 I applied the patch, which is trivial, and all is well 2008-02-11 12:35 it is gross that we still do not mention it on kernel.org though 2008-02-11 12:36 you have to read the changelogs 2008-02-11 12:36 pgquiles, I tried bash-3.2.25, still didn't see the problem 2008-02-11 12:36 i saw the two new stable releases announced 2008-02-11 12:36 but you're right nothing announced specifically about this 2008-02-11 12:36 flips: there is a funky program to use the exploit to disable vmsplice entirely. 2008-02-11 12:37 jiayingz: let's see what happens with this test 2008-02-11 12:37 I tried it, it worked, and it is a badly flawed way to close the hole 2008-02-11 12:37 it leaves your vm messed up 2008-02-11 12:37 flips: horribly, horribly flawed, but amusing 2008-02-11 12:37 jiayingz: I cannot really extrapolate because few zeros have been transferred but it seems it's going to take about 10 hours again 2008-02-11 12:39 5% in 29 minutes + or - means full replication would take about 580 minutes 2008-02-11 12:39 pgquiles, when it starts to hit the zeros, things might change 2008-02-11 12:39 flips: I has already started 2008-02-11 12:40 ok, well it looks like my original theory might be right 2008-02-11 12:40 will you let it run to completion? 2008-02-11 12:40 it has transferred 3444890 chunks so far 2008-02-11 12:40 sure, let's stress the hard disks :-) 2008-02-11 12:41 have we got an issue for the replication reset bug? 2008-02-11 12:41 flips: no 2008-02-11 12:42 is there any (more or less) hard-coded timeout for replication in ddsnap? 2008-02-11 12:42 jiayingz, your analysis looks accurate (and fast!) so how about opening the issue? 
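Editor's sketch: the "5% in 29 minutes means about 580 minutes" estimate above is a straight linear extrapolation. The helper name is hypothetical, and it assumes the transfer rate stays constant, which the chat itself doubts once the zeroed chunks are reached.

```python
def estimate_total_minutes(fraction_done: float, elapsed_minutes: float) -> float:
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_minutes / fraction_done


total = estimate_total_minutes(0.05, 29)
print(round(total), "minutes total")
print(round(total / 60, 1), "hours total")
print(round(total - 29), "minutes remaining")
```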
2008-02-11 12:42 no there isn't 2008-02-11 12:42 this is just a bug in our replication strategy 2008-02-11 12:43 a minor one I think 2008-02-11 12:43 well, with major bad effects 2008-02-11 12:43 but not requiring major redesign 2008-02-11 12:43 ACTION puts on his scramjet shoes 2008-02-11 12:46 I'm going to bed early today 2008-02-11 12:46 I'm really tired :-) 2008-02-11 12:46 see you tomorrow in the morning, I'll post the outcome of the test to the mailing list 2008-02-11 12:47 see you, and thanks 2008-02-11 13:49 -!- charlesnw(~charles@dctm.siderean.com) has left #zumastor 2008-02-11 13:53 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-11 14:51 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-11 15:19 shpor, ping? 2008-02-11 15:34 flipz pong 2008-02-11 18:12 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-11 20:37 hey flips 2008-02-11 20:37 so it turns out we do have 4x uid and gid in task_struct 2008-02-11 20:37 real uid, effective uid, saved-set uid, and filesystem uid 2008-02-11 20:38 and various config options and kernel versions offset them differently, the exploit cleverly just checks for uid,uid,uid,uid,gid,gid,gid,gid 2008-02-11 20:39 saved set uid so the process can switch back from being setuid 2008-02-11 22:03 lhi shapor 2008-02-11 22:04 I dimly remember that from reading a man page some time 2008-02-11 22:04 it does seem a little... less than general 2008-02-11 23:25 flips: so when are you going to fix slow deletes in ext3 :) 2008-02-11 23:26 ah 2008-02-11 23:26 hmm 2008-02-11 23:26 ACTION is waiting for a 15gb log file to go away 2008-02-11 23:27 well, if we can get write performance in ddsnap close to 1.something X then we throw away ext3 and use ext2 instead 2008-02-11 23:27 just make a new snapshot where ext3 would do a journal commit 2008-02-11 23:27 is ext2 any better 2008-02-11 23:27 ? 
2008-02-11 23:27 much better 2008-02-11 23:27 the delete badness is journalling 2008-02-11 23:28 and linking together deleted inodes 2008-02-11 23:28 so you dont lose track of indirect blocks or something? 2008-02-11 23:28 I used to know this 2008-02-11 23:29 a bigger journal helps 2008-02-11 23:29 and faster disk ;) 2008-02-11 23:31 I'm attempting to reload cache on that 2008-02-11 23:34 http://marc.info/?l=linux-fsdevel&m=95964244829590&w=2 <- what the orphan list is about 2008-02-11 23:35 it takes care of remembering inodes that are completely unlinked but not actually deleted yet, for one reason or another 2008-02-11 23:36 every time an inode link count goes to zero, the orphan list has to be updated, this is tied together with journal transactions that hopefully delete a lot of inodes, but it still is an extra overhead 2008-02-11 23:37 I seem to recall there is a worse bottleneck too 2008-02-11 23:40 ACTION is inventing new bash syntax 2008-02-11 23:41 what do you think of gzip <- file.log >- file.log.gz 2008-02-11 23:41 the minus denoting fadvise(FADV_NOREUSE) 2008-02-11 23:42 er FADV_DONTNEED 2008-02-11 23:43 it sucks when doing a log file rotation destroys the machines buffer cache 2008-02-11 23:44 heh 2008-02-11 23:44 the shell is really the ideal place to add it 2008-02-11 23:45 I can't immediately see a collision with existing syntax 2008-02-11 23:45 it's a way of making bash incrementally uglier than it already is... 
2008-02-11 23:45 :)
2008-02-11 23:45 nice work
2008-02-11 23:45 well there is already <<-
2008-02-11 23:46 which is the one of <, <<, >>, and > you dont need
2008-02-11 23:46 with files
2008-02-11 23:46 cat <<-EOF
2008-02-11 23:46 blah
2008-02-11 23:46 EOF
2008-02-11 23:46 actually, we could be less brain damaged about caching under obvious serial reading conditions
2008-02-11 23:46 tells it to strip leading whitespace from the blob
2008-02-11 23:46 not really
2008-02-11 23:46 you never know what the user's intention is unless they tell you
2008-02-11 23:46 what if i first grep the log file
2008-02-11 23:46 then decide i want to compress it
2008-02-11 23:47 you can make some very good guesses
2008-02-11 23:47 i'd like it to be cached then
2008-02-11 23:47 part of it should be, the first part
2008-02-11 23:47 well if filesize > physical memory
2008-02-11 23:47 you can make some gross assumptions
2008-02-11 23:47 then when you fill up cache in a straight serial read, start discarding recently read blocks
2008-02-11 23:48 hrm i wonder .. perhaps a per-file cache limit
2008-02-11 23:48 does such a thing already exist?
2008-02-11 23:48 rik van riel likes to poke at stuff like that
2008-02-11 23:48 "dont cache more than 100mb of any one file"
2008-02-11 23:48 i'd love to tell my kernel that
2008-02-11 23:48 I have tried my hand at it, and actually improved cache performance a little
2008-02-11 23:50 now back to how to just tell the os not to cache...
2008-02-11 23:50 the shell idea is interesting
2008-02-11 23:51 but what about cat
2008-02-11 23:51 you want to say "cat and don't cache" as well
2008-02-11 23:51 so I think the syntax should apply to the filename and not the redirection operator
2008-02-11 23:52 or apply to the whole command
2008-02-11 23:52 CACHE=never cat foo | something
2008-02-11 23:53 the problem is that cat has to fadvise, not bash
2008-02-11 23:53 if you say "cat foo"
2008-02-11 23:53 so cat would have to learn about env CACHE
2008-02-11 23:54 i'd rather not teach all the programs i commonly use
2008-02-11 23:54 i considered that approach first
2008-02-11 23:54 so how would cat learn to not cache?
2008-02-11 23:55 cat would have to learn to check CACHE
2008-02-11 23:55 to see if it should fadvise
2008-02-11 23:55 it doesn't need to learn if you always just use redirection when you want to instruct it
2008-02-11 23:55 cat < foo | something
2008-02-11 23:55 cat <- foo | something
2008-02-11 23:56 how about an fadvise that was not specific to a given file
2008-02-11 23:56 just "don't cache anything this process touches"
2008-02-11 23:56 that would be nice
2008-02-11 23:56 or rather, make it lowest priority for keeping in cache
2008-02-11 23:57 well i want the bash hack either way and as soon as i finish fighting with yacc i think it will work
2008-02-11 23:58 :)
2008-02-11 23:58 but per-process fadvise would be nice
2008-02-11 23:58 fadvise has been in for a few years now and sadly no one uses it
2008-02-11 23:59 and if it could just be invoked by twitchy op fingers then the cool kids would use it?
2008-02-11 23:59 yeah and they could tell all their friends how they didnt dirty the buffer cache irc.oftc.net #zumastor log beginning Tue Feb 12 00:00:01 PST 2008 2008-02-12 00:00 and feel smrt 2008-02-12 00:00 and get all the girls 2008-02-12 00:01 indeed 2008-02-12 00:01 well, if nobody optimizes cache performance by tomorrow morning then send in your patch 2008-02-12 00:02 per-user cache ulimit would be nice too 2008-02-12 00:03 checked out jiaying's chained replication patch? 2008-02-12 00:03 its easy to piss in everyones cheerios on a system if you have fast io 2008-02-12 00:03 fast linear io 2008-02-12 00:03 but slow random io 2008-02-12 00:03 not yet 2008-02-12 00:03 yes, it should be mandatory for all kernel devs to work on slow machines 2008-02-12 00:03 give them all fit pcs 2008-02-12 00:04 bash has exploded since the last time i looked in it 2008-02-12 00:04 seems like 2x the amount of code in bash 2 2008-02-12 00:05 I hope the new features are groovy enough to justify it 2008-02-12 00:05 so in your case above, I agree that the streaming read and write should not push everything else out of cache 2008-02-12 00:06 we have known that is bad for years, but not done much about it 2008-02-12 00:06 the problem does not yield well to naive attacks 2008-02-12 00:06 which is all we have seen so far 2008-02-12 00:07 the mail system i used to run depended heavily on the buffer cache in order to keep up with mail flow since the backing store was an emc nfs server 2008-02-12 00:07 which was at its limit of iops 2008-02-12 00:07 we saved lots of reads because most messages being read were recently delivered 2008-02-12 00:08 grep the log file and the cache is destroyed, reads hit the nfs server, it slows down, mail backs up, and everyone is unhappy 2008-02-12 00:09 so you become reluctant to grep the log file 2008-02-12 00:09 because the local disk is fast, especially reading a log file linearly 2008-02-12 00:09 the cache for the slow storage is destroyed 2008-02-12 
00:09 that another factor 2008-02-12 00:10 so there is a concept called "use once" (invented by me) 2008-02-12 00:10 which is, a page newly entered into cache gets lowest priority unless it is referenced a second time 2008-02-12 00:11 that doesnt help 2008-02-12 00:11 because? 2008-02-12 00:11 because the mail only gets read once too 2008-02-12 00:12 so why does it matter if it gets pushed out of cache? 2008-02-12 00:12 it gets written once, then read once 2008-02-12 00:12 data point: compiling a defconfig 2.6 kernel on the fit pc takes 45 minutes 2008-02-12 00:12 heh 2008-02-12 00:13 shapor: yeah, but it wouldn't get deprioritized after the write, so then it'd likely still be there for the read 2008-02-12 00:13 that's two uses, not one 2008-02-12 00:14 my point is, if the most you use is twice, the "use once" algorithm hurts you 2008-02-12 00:14 flips: what you really want for the specific mail case is "write = high priority, read after write = low priority + your 'single read = low priority'" 2008-02-12 00:14 something like that 2008-02-12 00:14 fiddle until it works :) 2008-02-12 00:14 there is always the magical special case 2008-02-12 00:14 many applications are write-once, read-once 2008-02-12 00:15 esp when there are multiple layers of caching involved 2008-02-12 00:15 It seems like application level hinting is really the only way this debate will ever end 2008-02-12 00:15 the default should suck less 2008-02-12 00:15 flips: exactly 2008-02-12 00:15 flips: yeah, but for which scenario? 
;-) 2008-02-12 00:15 per-user and/or per-process limits are nice 2008-02-12 00:16 i think they cover a lot of cases 2008-02-12 00:16 default can be unlimited still 2008-02-12 00:16 it seems sensible that pages brought in by large linear reads should be lower priority than pages just written 2008-02-12 00:16 but easy-to-tweak is the key 2008-02-12 00:16 and everyone understands ulimits 2008-02-12 00:17 flips: not always true 2008-02-12 00:17 yes, when you need to tell it what to do and it really matters, it is nice to have the knobs 2008-02-12 00:17 it doesn't have to be always true 2008-02-12 00:17 it often isn't 2008-02-12 00:17 ACTION imagines most transcoders probably look like "large linear read" followed by "another large linear read" ;-) 2008-02-12 00:17 grpe the same 500mb file on your system 10 times 2008-02-12 00:17 that happens a lot 2008-02-12 00:18 that's "use lots of times" 2008-02-12 00:18 if there is nothing competing with it, it should be cache 2008-02-12 00:18 cached 2008-02-12 00:18 flips: the transcoder isn't. It reads the data exactly 2x, and it needs space for write buffering simultaneously 2008-02-12 00:19 what is the case again? 2008-02-12 00:20 flips: Good quality transcoders make two passes through a source file. The first one they mostly figure out what the data rate is going to be like of the various bits of the file, and then they do a second pass looking at the source file and the new data they wrote and use that to generate the final output. 2008-02-12 00:21 a prescient pager would know to toss out bits of the source file in sync with the corresponding bits of the scratch file, and recognize that both could be ditched as soon as they were read twice. 2008-02-12 00:22 as per usual, the real solution to getting the optimization right is time travel. 
;-) 2008-02-12 00:23 cbsmith, the transcoder should be using madvise 2008-02-12 00:23 flips: yup 2008-02-12 00:26 to updatedb remains as much of a problem as it ever was 2008-02-12 00:26 updatedb fire up in the middle of the night and pushes out all your executables 2008-02-12 00:26 etc 2008-02-12 00:40 -!- juuva(juuva@peili.org) has joined #zumastor 2008-02-12 00:49 oy 2008-02-12 00:49 caching is just totally broken in 2.6 for ages 2008-02-12 00:51 howso? 2008-02-12 00:52 took a minute and a half to diff two kernel trees the second time, on this 1.7 ghz pentium m 2008-02-12 00:52 used to take 1.7 seconds, about 2008-02-12 00:53 heh sweet my bash support <- now :) 2008-02-12 00:54 :) 2008-02-12 00:54 you are fast 2008-02-12 00:54 well make your mark on bash and you can die knowing you lived a full life 2008-02-12 00:55 hm do fadv flgs show up in /proc//fdinfo/ ? 2008-02-12 00:56 no clue 2008-02-12 00:56 i need a newer kernel to get fdinfo anyway 2008-02-12 00:56 ACTION has developed a sick habit of editing patches 2008-02-12 00:56 got to concentrate 2008-02-12 00:57 cat <- largefile looks cool 2008-02-12 00:57 it does 2008-02-12 00:58 the - sign even sort of suggests what it does 2008-02-12 01:18 ugh 2008-02-12 01:18 doesn't have the expected behavior 2008-02-12 01:18 wow fadvise is the most worthless system call ever 2008-02-12 01:19 you have to call it after every read() 2008-02-12 01:19 wtf 2008-02-12 01:20 :-D 2008-02-12 01:20 shock 2008-02-12 01:21 time to make your first post to lkml 2008-02-12 01:21 say exactly that 2008-02-12 01:22 no wonder no one uses it 2008-02-12 01:22 it should be called buffercache_evict 2008-02-12 01:22 going to get brave and post that? 
2008-02-12 01:23 Subject: fadvise is the most worthless system call ever 2008-02-12 01:23 heh 2008-02-12 01:24 in fact, its worse than that 2008-02-12 01:25 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-12 01:26 good morning 2008-02-12 01:28 hi 2008-02-12 01:29 very late over there :-) 2008-02-12 01:29 look at this: /bin/zumastor: line 766: /proc/fs/nfsd/suspend: No such file or directory 2008-02-12 01:29 that might be the problem with the failing ioctl 2008-02-12 01:30 flips: it lets me evict pages from the buffer cache which someone else needs 2008-02-12 01:30 zeroing does definitely improve times 2008-02-12 01:30 the fitpc takes 45 minutes to compile a kernel using 5 watts, my pentium m takes, um, about 6 minutes at about, um, 50 watts. So the fit pc compiles the kernel with 3.75 watt hours, while the pentium m requires 5 watt hours 2008-02-12 01:30 7h 20 min instead of 10h 30 min 2008-02-12 01:30 (by the way) 2008-02-12 01:30 ahah 2008-02-12 01:31 flips: have you tried tcc to build the kernel? 2008-02-12 01:31 pgquiles: :) 2008-02-12 01:31 shapor 1, the world (particularly pgquiles) 0 2008-02-12 01:31 :-) 2008-02-12 01:31 not a huge improvement, but improves it 2008-02-12 01:31 yeah i'm suprised it was not more 2008-02-12 01:31 I wonder why, because I do not think the network is ever saturated over there 2008-02-12 01:31 most of time spend blocking on io ? 2008-02-12 01:31 I'm surprised it was anything 2008-02-12 01:31 no, never saturated 2008-02-12 01:31 shapor, what I think 2008-02-12 01:32 so why did it improve it at all? 2008-02-12 01:32 curious 2008-02-12 01:32 because we are single threaded 2008-02-12 01:32 and? 
2008-02-12 01:32 we dont waste lots of time gzipping
2008-02-12 01:32 but cpu is not pegged either
2008-02-12 01:32 well
2008-02-12 01:32 cpu is 90% idle
2008-02-12 01:32 cpu does not have to be pegged
2008-02-12 01:33 I think shapor is right
2008-02-12 01:33 well before it was probably disk pegged, cpu pegged, disk pegged, cpu pegged back and forth
2008-02-12 01:33 yes
2008-02-12 01:33 probably was 20% more cpu used before or so
2008-02-12 01:33 gzip can compress zeros more efficiently than real data
2008-02-12 01:34 what level of gzip is being used? (0 - 9 ?)
2008-02-12 01:34 another interesting test would be to turn compression off
2008-02-12 01:34 6 i think
2008-02-12 01:34 sounds right
2008-02-12 01:34 turning off compression should saturate the network nicely
2008-02-12 01:34 274 ddsnap transmit $server $host:$port -r -g 6 $old_snap $snap$resume -p $send_file
2008-02-12 01:34 yeah 6
2008-02-12 01:35 pgquiles, it was a useful experiment, I better tag another optimization on to my optimization post
2008-02-12 01:35 "parallelize gzip"
2008-02-12 01:36 ok, so we are convinced that the 30 MB/sec is disk seeking?
2008-02-12 01:36 gzip -6 compresses 16k of zeros to 57 bytes
2008-02-12 01:36 gzip -9 compresses 16k of zeros to 56 bytes
2008-02-12 01:36 unnoticeable
2008-02-12 01:36 now, how can we tell if the source or the target is more disk bound
2008-02-12 01:37 I'd like to reboot the servers and check the disk controller configuration
2008-02-12 01:37 also, take into account I'm using raid 6
2008-02-12 01:37 flips before i make my post i will run my tests with 2.6.24.2
2008-02-12 01:38 btw we gzip an extent at a time, not a chunk
2008-02-12 01:38 yes, be sure
2008-02-12 01:38 right
2008-02-12 01:38 btw, what about several volumes replicating from origin to destination? should I use a different port for each volume?
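Editor's sketch: the gzip-on-zeros measurement above can be repeated with Python's standard `gzip` module. Exact byte counts will probably differ a little from the command-line figures quoted in the chat, since the header contents depend on the tool, so no sizes are claimed here. The random buffer is an added comparison, not something from the log.

```python
import gzip
import os

CHUNK = 16 * 1024          # 16k, as in the measurement quoted in the chat
zeros = bytes(CHUNK)       # a zeroed buffer
noise = os.urandom(CHUNK)  # random data as a worst case for compression


def gz_size(data: bytes, level: int) -> int:
    """Return the size in bytes of data after gzip compression at the given level."""
    return len(gzip.compress(data, compresslevel=level))


for level in (6, 9):
    print("level", level, "zeros:", gz_size(zeros, level), "random:", gz_size(noise, level))
```

If levels 6 and 9 come out almost identical for zeros, as the chat reports, the compression level is not the interesting knob; the CPU time spent per extent is.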
2008-02-12 01:38 had to do that to make xdelta have a noticeable effect
2008-02-12 01:39 pgquiles: yes
2008-02-12 01:39 we should check for that
2008-02-12 01:39 ok
2008-02-12 01:39 and make sure they have unique ports
2008-02-12 01:39 and should we enter an issue to share ports?
2008-02-12 01:39 I'd like to test that next
2008-02-12 01:40 hmm
2008-02-12 01:40 if you use the same port bad stuff will happen
2008-02-12 01:40 flips: I don't think sharing ports is needed
2008-02-12 01:40 and nothing stops/warns you
2008-02-12 01:40 yes, see hmm above
2008-02-12 01:41 flips i'd like to try my test on an hp-ux box, they also implement posix_fadvise
2008-02-12 01:41 I'm opening an issue about port collision
2008-02-12 01:41 thanks
2008-02-12 01:42 i think the fadvise idea was misinterpreted
2008-02-12 01:42 I am thinking about how we can easily determine whether source or target is more disk-bound
2008-02-12 01:42 well if the target has a snapshot set, it certainly would be
2008-02-12 01:42 unless it's an order of magnitude faster
2008-02-12 01:44 shapor: or it has a large enough disk cache
2008-02-12 01:44 which is effectively the same
2008-02-12 01:45 if the target is bound then ddsnap replicate on source should show idle time after sending each extent
2008-02-12 01:45 sorry but, what is an extent?
2008-02-12 01:45 it is a bunch of chunks
2008-02-12 01:45 ok
2008-02-12 01:45 how many chunks?
2008-02-12 01:46 as many as are contiguous and changed, up to some limit, I think 1 MB or something
2008-02-12 01:46 ok
2008-02-12 01:47 is data transmission optimized to PDU size to avoid IP fragmentation?
2008-02-12 01:47 adding extents in more places will be one of our big optimizations
2008-02-12 01:48 no
2008-02-12 01:48 733 if (fullvolume) {
2008-02-12 01:48 734 extent_size = chunk_size;
2008-02-12 01:48 yuck
2008-02-12 01:48 why?
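Editor's sketch: the extent idea described above (contiguous changed chunks grouped together, up to a size limit) could look roughly like this. It is not the ddsnap code; `coalesce_extents` and the cap of 64 chunks per extent (1 MB at an assumed 16 KB chunk size) are illustrative assumptions.

```python
from typing import Iterable, Iterator, Tuple


def coalesce_extents(changed_chunks: Iterable[int],
                     max_chunks: int) -> Iterator[Tuple[int, int]]:
    """Yield (first_chunk, count) for each run of consecutive chunk numbers,
    splitting any run that would exceed max_chunks."""
    start = None
    count = 0
    for chunk in sorted(set(changed_chunks)):
        # Extend the current extent while chunks are contiguous and under the cap.
        if start is not None and chunk == start + count and count < max_chunks:
            count += 1
            continue
        if start is not None:
            yield start, count
        start, count = chunk, 1
    if start is not None:
        yield start, count


# Hypothetical changed-chunk list; 64 chunks per extent is an assumed cap.
print(list(coalesce_extents([3, 4, 5, 9, 10, 20], max_chunks=64)))
```

With `max_chunks=1` every chunk becomes its own extent, which is the behaviour the `extent_size = chunk_size` line quoted above produces for full-volume replication.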
2008-02-12 01:49 somebody got beyond lazy methinks
2008-02-12 01:49 ok, that can be tomorrow's optimization ;)
2008-02-12 01:49 :)
2008-02-12 01:49 :-)
2008-02-12 01:50 so, what's the status of 0.7? :-P
2008-02-12 01:50 the status is, I'm working on my revert patch tomorrow
2008-02-12 01:51 and if I keep staying up then that will be the end of that ;)
2008-02-12 01:52 and in other cases the extent size limit is 1m
2008-02-12 01:52 good catch
2008-02-12 01:53 i think i remember catching that in a code review
2008-02-12 01:55 but i also remember an awesome diffstat of the patch
2008-02-12 01:56 maybe i forgot to say something because i was busy running diffstat :)
2008-02-12 01:57 $ svn diff -r 364:365 | diffstat
                  ddsnap.c           | 633 +++++++++++------------------------------------------
                  kernel/dm-ddsnap.h | 2
                  2 files changed, 138 insertions(+), 497 deletions(-)
2008-02-12 01:57 heh
2008-02-12 01:57 making it work properly should be a matter of subtracting code
2008-02-12 01:58 which patch was that?
2008-02-12 01:58 fixing full volume replication 2008-02-12 01:58 r365 2008-02-12 01:58 last april 2008-02-12 01:58 nice 2008-02-12 01:58 wow, that was long ago 2008-02-12 01:58 yeah 2008-02-12 01:59 gack, that code is awful 2008-02-12 02:00 but it has worked for the most part 2008-02-12 02:00 its amazing how much aweful it used to be ;) 2008-02-12 02:00 more aweful* 2008-02-12 02:03 -!- flipz(~phillips@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 02:04 theres some bogus factoring in there too 2008-02-12 02:04 chunks_in_extent() 2008-02-12 02:04 called exactly once 2008-02-12 02:05 fixing extent_size for fullvolume should be easy 2008-02-12 02:05 yeah 2008-02-12 02:06 instead of setting num_of_chunks to 1, set it to min(somebignumber, chunksremaining) 2008-02-12 02:07 http://insights.oetiker.ch/linux/fadvise.html 2008-02-12 02:07 so my natural inclination would have been to incorporate the extent coalescing into the chunk output loop with something like num_of_chunks++; continue; 2008-02-12 02:08 though the way it is done with a helper function is not horrible 2008-02-12 02:08 flips ^ the hoops you have to leap through to get sane behavior 2008-02-12 02:08 from the buffer cache :( 2008-02-12 02:10 so what makes you think you have to call it after every read? 
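The extent coalescing discussed here can be sketched roughly as follows. This is a minimal Python illustration, not the ddsnap.c code: `coalesce_extents`, `fullvolume_extents` and `max_chunks` are hypothetical names, and the "1 MB or something" limit mentioned in the channel is expressed as a chunk count. The first function merges contiguous changed chunks into one extent; the second shows the proposed full-volume fix of using min(limit, chunks remaining) instead of one chunk per extent.

```python
def coalesce_extents(changed_chunks, max_chunks):
    """Group sorted chunk numbers into (start, count) extents.

    An extent grows while the next chunk is contiguous and the
    extent still holds fewer than max_chunks chunks.
    """
    if max_chunks < 1:
        raise ValueError("max_chunks must be at least 1")
    extents = []
    start = None
    count = 0
    for chunk in changed_chunks:
        if start is not None and chunk == start + count and count < max_chunks:
            count += 1  # contiguous and under the limit: extend the extent
            continue
        if start is not None:
            extents.append((start, count))
        start, count = chunk, 1
    if start is not None:
        extents.append((start, count))
    return extents


def fullvolume_extents(total_chunks, max_chunks):
    """Full-volume case: every chunk is sent, so each extent covers
    min(max_chunks, chunks remaining) rather than a single chunk."""
    if max_chunks < 1:
        raise ValueError("max_chunks must be at least 1")
    extents = []
    pos = 0
    while pos < total_chunks:
        count = min(max_chunks, total_chunks - pos)
        extents.append((pos, count))
        pos += count
    return extents
```

With a limit of 4 chunks, a 10-chunk full-volume transfer would go out as three extents instead of ten single-chunk ones.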
2008-02-12 02:10 my test program 2008-02-12 02:10 and the source code 2008-02-12 02:10 http://lxr.linux.no/linux/mm/fadvise.c#L96 2008-02-12 02:10 you have to tell it to invalidate the cache each time you populate it with a page 2008-02-12 02:10 you could do it periodically 2008-02-12 02:11 no persistent flag gets set 2008-02-12 02:11 on the fd 2008-02-12 02:11 omigod, that is crap 2008-02-12 02:11 totally misses the point 2008-02-12 02:11 as you say 2008-02-12 02:11 yeah 2008-02-12 02:12 wow 2008-02-12 02:12 totally not what http://www.opengroup.org/onlinepubs/009695399/functions/posix_fadvise.html 2008-02-12 02:12 was thinking 2008-02-12 02:12 it is pure braindamage 2008-02-12 02:12 I always just assumed it was a fancy vm policy flag as it should be 2008-02-12 02:12 yeah well 2008-02-12 02:13 best part is 2008-02-12 02:13 its linux 2008-02-12 02:13 its open, and it can be fixed 2008-02-12 02:13 creates jobs for hackers 2008-02-12 02:13 i look forward to coding more than sitting on the phone/email with a vendor for the next week 2008-02-12 02:13 ok, it goes on the list of hacking projects, please remind me 2008-02-12 02:14 well i'll cry about it on lkml 2008-02-12 02:14 yes 2008-02-12 02:14 someone else might pick it up 2008-02-12 02:14 looks like your upcoming post is more than justified 2008-02-12 02:16 now wait, which call is at issue, POSIX_FADV_DONTNEED? 2008-02-12 02:17 yeah 2008-02-12 02:18 "It tells the OS that we will not be needing the specified bytes again. The effect of this is, that the bytes will be released from the file system cache." 2008-02-12 02:18 POSIX_FADV_NOREUSE Specifies that the application expects to access the specified data once and then not reuse it thereafter. 
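Since the advice is not remembered for the file descriptor, a reader that wants to stay out of the page cache has to repeat the call for each range it has just read. A minimal sketch of that pattern, assuming Python 3.3+ on a platform that provides `posix_fadvise`; `read_without_caching` and `block_size` are hypothetical names, and this is not the test program mentioned in the channel.

```python
import os


def read_without_caching(path, block_size=1 << 20):
    """Read a file sequentially and, after each block, advise the kernel
    that the bytes just read will not be needed again.

    Returns the total number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while True:
            data = os.read(fd, block_size)
            if not data:
                break
            # No persistent flag is set on the descriptor, so the advice
            # is repeated for every range that has just been read.
            if hasattr(os, "posix_fadvise"):
                os.posix_fadvise(fd, offset, len(data), os.POSIX_FADV_DONTNEED)
            offset += len(data)
            total += len(data)
    finally:
        os.close(fd)
    return total
```

Issuing the advice once per block keeps the example simple; doing it periodically over larger ranges, as suggested in the channel, would reduce the number of system calls.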
2008-02-12 02:18 that just break's 2008-02-12 02:18 noop 2008-02-12 02:19 because that is supposed to be our default behaviour 2008-02-12 02:19 expectation, rather 2008-02-12 02:19 that is the use-once 2008-02-12 02:19 but if i fadvise 2008-02-12 02:19 tell it NOREUSE 2008-02-12 02:19 that means to me "don't put me in the buffer cache" 2008-02-12 02:20 there is no sense in actually evicting it, just put it at the front of the list of things to evict 2008-02-12 02:20 that's what use-once does 2008-02-12 02:21 so until you convince me we violated the man page, hang on to that post ;) 2008-02-12 02:24 "As of this writing (2.6.21) Linux does not remember POSIX_FADV_DONTNEED advice for an open file. It acts when the advice is given, and when it can not comply it forgets the advice." 2008-02-12 02:24 -!- flipz(~phillips@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 02:24 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-12 02:24 -!- juuva(juuva@peili.org) has joined #zumastor 2008-02-12 02:24 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-12 02:24 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 02:24 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-12 02:24 so my naive test of looking in /proc/meminfo is insufficient 2008-02-12 02:24 due to the use-once 2008-02-12 02:24 right 2008-02-12 02:24 you need to know what's at the hot and cold ends of the lru lists 2008-02-12 02:25 although i dont like the fact that something which has only been read once without fadvise 2008-02-12 02:25 will get evicted for my new read once which has been fadvised NOREUSE 2008-02-12 02:25 did you find what solaris does?
2008-02-12 02:26 i dont think they have fadvise 2008-02-12 02:26 didnt look too hard though 2008-02-12 02:28 back in '02 there was an lkml debate over fadvise, to just support more open() O_NOREUSE, etc flags 2008-02-12 02:29 and tweak them later with fcntl 2008-02-12 02:29 none of the fadvise calls do anything persistent 2008-02-12 02:29 sorry, thats just broken 2008-02-12 02:30 except the r_pages tweak 2008-02-12 02:38 shapor, you are right, this is what SUS has to say about DONTNEED: 2008-02-12 02:38 sus? 2008-02-12 02:38 "Specifies that the application expects that it will not access the specified data in the near future" 2008-02-12 02:38 single unix specification, i.e. posix 2008-02-12 02:38 http://www.opengroup.org/onlinepubs/009695399/ 2008-02-12 02:39 yeah thats what i was referring to 2008-02-12 02:39 the linux interpretation is lame 2008-02-12 02:40 just as you say 2008-02-12 02:41 yeah it says nothing about dropping files from cache 2008-02-12 02:41 it allows a malicious user to drop buffer cache of frequently used files 2008-02-12 02:41 its stupid 2008-02-12 02:42 http://lwn.net/Articles/200216/ 2008-02-12 02:46 POSIX_FADV_DONTNEED Specifies that the *application* expects that it will not access the specified data in the near future. 2008-02-12 02:46 but now people are using it as a page cache drop mechanism 2008-02-12 02:46 ugh 2008-02-12 02:47 so are you going to propose that the kernel remember all the different fadvise ranges that can be done for each inode? 2008-02-12 02:48 that sounds like the only way to do it 2008-02-12 02:48 a mask of the flags is sufficient I think 2008-02-12 02:49 ? 2008-02-12 02:49 the vast majority of cases will be all or nothing though 2008-02-12 02:49 Just or all the madvise()'s together for a given page. 
2008-02-12 02:49 shapor: I should think the normal case will be nothing or one thing 2008-02-12 02:50 flips: it should at least handle the "all of the file case" 2008-02-12 02:51 The posix_fadvise() function shall advise the implementation on the expected behavior of the application with respect to the data in the file associated with the open file descriptor, fd, starting at offset and continuing for len bytes. The specified range need not currently exist in the file. If len is zero, all data following offset is specified. The implementation may use this information to optimize handling of the specified data. The posix_fadvise() f 2008-02-12 02:51 ok, it just took me more time to remake the patches for ddlink and ddsetup to include the improvement I made on the ride back from disneyland than it did to write the code 2008-02-12 02:51 by about a factor of two 2008-02-12 02:51 due mostly do diff not working in cache 2008-02-12 02:51 so now I'm mad ;) 2008-02-12 02:51 flips: lol 2008-02-12 02:51 :) 2008-02-12 02:51 flips: You in laptop mode or something? 2008-02-12 02:51 laptop mode? 2008-02-12 02:52 flips: yeah, used to prolong battery life. Does all kinds of horrid things with paging rules 2008-02-12 02:52 flips: I use it all the time, but not for the improved performance I'll tell you! 2008-02-12 02:52 I seriously doubt it, this behavior is consistent on all kinds of systems 2008-02-12 02:53 try it on your own 2008-02-12 02:53 diff two kernel trees, repeat, the second time should be fast but isn't 2008-02-12 02:56 root@usermode:~# ddsetup targets 2008-02-12 02:56 crypt v1.5.0 2008-02-12 02:56 striped v1.0.2 2008-02-12 02:56 linear v1.0.2 2008-02-12 02:56 error v1.0.1 2008-02-12 02:56 <- product of the bus ride 2008-02-12 02:56 cool 2008-02-12 02:57 getting close to posting this 2008-02-12 02:59 oh wait... I'm doing a vobcopy in the middle of this. 2008-02-12 03:12 ah, I have too many transcodes going on in the background to make this benchmark meaningful... 
let me see if I can suspend them for a second. 2008-02-12 03:24 leviathan src # time diff -r -u linux-2.6.23-gentoo-r8 linux-2.6.22.10-zumastor-0.6-r1318 >> /dev/null 2008-02-12 03:24 real 0m15.251s 2008-02-12 03:24 user 0m1.890s 2008-02-12 03:24 sys 0m1.060s 2008-02-12 03:24 leviathan src # time diff -r -u linux-2.6.23-gentoo-r8 linux-2.6.22.10-zumastor-0.6-r1318 >> /dev/null 2008-02-12 03:24 real 0m2.488s 2008-02-12 03:24 user 0m1.760s 2008-02-12 03:24 sys 0m0.720s 2008-02-12 03:24 Looks like it was a tad quicker the second time :-) 2008-02-12 03:25 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-12 03:26 And in case you think the diff was small or something. Diffstat reports: 6659 files changed, 165923 insertions(+), 201267 deletions(-) 2008-02-12 03:27 I'll let the transcode jobs run for a while and rerun just to make sure I'm not smoking something. 2008-02-12 03:49 Okay, looks like it was still sitting in cache before. New stats: http://pastebin.com/d71565418 2008-02-12 03:49 Cache is looking good to me. :-) 2008-02-12 04:05 what happens when snapshot number arrives to max_int or whatever the type is? does it start from zero again? (I guess it does) 2008-02-12 09:06 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-12 09:36 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-12 09:53 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-12 10:34 pgquiles: yes 2008-02-12 10:35 jiayingz: thanks 2008-02-12 10:36 pqquiles: did you still see the device mapper failure in your new replication test? 
2008-02-12 10:37 jiayingz: yes, it's still present 2008-02-12 10:37 jiayingz: I added that lsof, I'll see what it says 2008-02-12 10:38 jiayingz: btw, in /bin/zumastor, lines 764 and 771, /proc/fs/nfsd/suspend should only be written/read if NFS is in use 2008-02-12 10:38 it's failing in my servers because I do not have NFS 2008-02-12 10:38 pgquiles: that is true 2008-02-12 10:38 I'm opening an issue 2008-02-12 10:39 pgquiles: did you install the zumastor kernel package or you built kernel from source? 2008-02-12 10:40 should be [[ -d /proc/fs/nfsd ]] && echo "foo" > /proc/fs/nfsd/suspend 2008-02-12 10:40 jiayingz: do you mean zumastor.org's kernel package or my package? 2008-02-12 10:40 I'm using my package 2008-02-12 10:40 pgquiles: i c 2008-02-12 10:41 jiayingz: that won't work 2008-02-12 10:41 why? 2008-02-12 10:41 jiayingz: /proc/fs/nfsd does exist but /proc/fs/nfsd/suspect is not writable 2008-02-12 10:41 s/suspect/suspend 2008-02-12 10:42 really? if NFSD is not configured, why it still creates /proc/fs/nfsd 2008-02-12 10:42 that is interesting 2008-02-12 10:43 maybe it's because the kernel was compiled with NFS support but the userspace tools are not installed 2008-02-12 10:43 pgquiles: that should not cause the device mapper failure though. it should do no harm if the write to /proc/fs/nfsd/suspend fails 2008-02-12 10:43 yeah, quite possible 2008-02-12 10:45 yes, I don't think that causes the dm failure, it's just I remembered that about the nfs errors this morning 2008-02-12 10:46 [[ -w /proc/fs/nfsd/suspend ]] && echo ... 
should fix the error 2008-02-12 10:47 pgquiles: if it is not too much trouble, could you try smaller volume size, like 10G, and set replication cycle smaller, like 5 seconds 2008-02-12 10:48 sure 2008-02-12 10:48 I'm going off line for a minute while I vpn to work 2008-02-12 10:48 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-12 10:49 I'm back 2008-02-12 10:49 that is fast :) 2008-02-12 10:51 openvpn magic :-) 2008-02-12 10:52 Tue Feb 12 18:22:59 CET 2008 /bin/zumastor[4852]: new snapshot will be '116' 2008-02-12 10:52 error writing to /var/lib/zumastor/volumes/zumatest2/targets/dubna/trigger 2008-02-12 10:52 Tue Feb 12 18:24:26 CET 2008 /bin/zumastor[4852]: dropping snapshot for zumatest2(116) 2008-02-12 10:52 Tue Feb 12 18:24:26 CET 2008 /bin/zumastor[4852]: Checking open files for that snapshot with lsof '/dev/mapper/zumatest2(116)' : 2008-02-12 10:52 Tue Feb 12 18:24:26 CET 2008 /bin/zumastor[4852]: 2008-02-12 10:52 Tue Feb 12 18:24:26 CET 2008 /bin/zumastor[4852]: *** END OF LSOF *** 2008-02-12 10:52 device-mapper: remove ioctl failed: Device or resource busy 2008-02-12 10:52 Command failed 2008-02-12 10:52 Tue Feb 12 18:24:26 CET 2008 /bin/zumastor[4852]: remove device failed for zumatest2(116) 2008-02-12 10:52 that empty line is the output of lsof /dev/mapper/zumatest2(116) 2008-02-12 10:52 nothing 2008-02-12 10:52 :-/ 2008-02-12 10:53 wonder that is openvpn problem or lsof problem 2008-02-12 10:54 openvpn? 2008-02-12 10:55 ah, never mind 2008-02-12 10:56 thought your comment about 'openvpn magic' was for the log message :-/ 2008-02-12 10:56 :-) 2008-02-12 10:57 -!- SEJeff(~jeff@office4.tmcs.net) has joined #zumastor 2008-02-12 10:58 could you try dmsetup info /dev/mapper/$vol\($snap\) ? 
2008-02-12 10:59 in drop_snapshot 2008-02-12 11:03 done 2008-02-12 11:03 but I'll need to stop & restart zumastor to make sure it reloads /bin/zumastor 2008-02-12 11:04 fact is, when it works (and it works mostly fine), it works fucking well :-) 2008-02-12 11:04 I've just replicated 62GB of data 2008-02-12 11:05 it is good to hear that :) 2008-02-12 11:06 re issue 65 (port collision), is there any place where the target stores the port it is listening to? I cannot find the port in /var/lib/zumastor/volumename/ 2008-02-12 11:09 hmm, there is not 2008-02-12 11:09 I thought it is under /var/lib/zumastor/$vol/source 2008-02-12 11:10 ah, it is under /var/lib/zumastor/$vol/targets/$targetname/port 2008-02-12 11:10 but that's in the origin 2008-02-12 11:10 not in the target 2008-02-12 11:11 right, zumastor is using a mixed push/pull model for replication 2008-02-12 11:11 say you have servers A and B, each one is replicating a different volume to server C 2008-02-12 11:11 server C has no way to avoid A and B from using the same port because C does not know the port 2008-02-12 11:11 we should make it an option of 'zumastor define source' 2008-02-12 11:12 yes, that too 2008-02-12 11:12 but that's not what I meant 2008-02-12 11:12 A ---> 11325:C 2008-02-12 11:12 B ---> 11325:C 2008-02-12 11:12 C should make the decision which port it listens to 2008-02-12 11:12 then B tries zumastor define target with port 11325, C has no way to reject it 2008-02-12 11:13 oops 2008-02-12 11:13 we should specify the port on define source 2008-02-12 11:13 and tells A and B 2008-02-12 11:14 I think the option should be both in define target and in define source 2008-02-12 11:14 pgquiles: right 2008-02-12 11:14 A:43434 --> 11325:C 2008-02-12 11:14 B:49774 --> 11326:C 2008-02-12 11:15 the port to send data for replication is automatically decided 2008-02-12 11:15 zumastor receive start shouldn't take a port 2008-02-12 11:15 thats my brain damage 2008-02-12 11:16 it should have the in the 
configuration 2008-02-12 11:16 it should return the port though 2008-02-12 11:16 so you can avoid conflicts between different volumes on the same box 2008-02-12 11:16 exactly 2008-02-12 11:16 how are the ACKs from the replica to the origin transmitted? though the replication channel or through ssh? 2008-02-12 11:16 replication channel 2008-02-12 11:17 let me think how to screw zumastor :-P 2008-02-12 11:20 keep doing it and we'll keep fixing up 2008-02-12 11:20 mmm I cannot think of any way in which the origin could receive ACKs from two different replicas in the same port, given that the sending port is automatically decided 2008-02-12 11:20 bugs 2008-02-12 11:20 :-) 2008-02-12 11:20 ? 2008-02-12 11:21 it comes over the same socket 2008-02-12 11:21 that should work just fine 2008-02-12 11:21 same socket it is sending the delta over that is 2008-02-12 11:21 its tcp, not udp 2008-02-12 11:23 I meant I cannot find a case dual to the one I said before (where two origins may try to use the same port in the replica) 2008-02-12 11:23 I cannot think how two replicas could try to send feedback to the same port of an origin 2008-02-12 11:23 the source should choose two different ports 2008-02-12 11:25 wait, just checked the code, we use 4321 as the default port 2008-02-12 11:25 cbsmith, there? 
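The collision described above (A and B both told to replicate to port 11325 on C, with C unable to reject the second one) goes away if the replica keeps a record of which port each incoming volume uses. A hypothetical sketch: `PortRegistry` and its methods are invented for illustration and are not part of /bin/zumastor; only the port numbers come from the channel.

```python
class PortRegistry:
    """Tracks which listening port each replicated volume uses on one host."""

    def __init__(self):
        self._by_port = {}  # port number -> volume name

    def claim(self, volume, port):
        """Reserve port for volume; raise if another volume already holds it."""
        owner = self._by_port.get(port)
        if owner is not None and owner != volume:
            raise ValueError(f"port {port} already used by volume {owner!r}")
        self._by_port[port] = volume
        return port

    def release(self, volume):
        """Free every port held by volume."""
        for port in [p for p, v in self._by_port.items() if v == volume]:
            del self._by_port[port]
```

In this sketch the replica is the side that owns the decision, which matches the remark that C should decide which port it listens on and then tell A and B.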
2008-02-12 11:26 in fact the problem is not only with ports but with volume names, too 2008-02-12 11:27 if I use different ports in A and B but use the same name (zumatest) for the volume in A and B, corruption may happen 2008-02-12 11:28 i think the current code assumes A and B use different volume names in that case 2008-02-12 11:29 we do allow A and B use different volume names from C's now 2008-02-12 11:29 so A and B can use the same volume name, but as long as C use two different volume names, it is ok 2008-02-12 11:30 that part of code hasn't been tested though :) 2008-02-12 11:31 mmm 2008-02-12 11:32 there's something wrong in my mind or in zumastor (and I really hope is my mind (-: ) 2008-02-12 11:32 pgquiles: does that make sense to you? 2008-02-12 11:32 zumastor define target vol host:port ==> Replicate this volume to this host through that port, OK 2008-02-12 11:32 right 2008-02-12 11:33 now I do that for a second volume from the same origin: 2008-02-12 11:33 zumastor define target vol2 host:port2 2008-02-12 11:33 but in the replica, I just say 2008-02-12 11:33 to the same target? 2008-02-12 11:33 zumastor define source vol_in_the_replica origin 2008-02-12 11:34 how does 'define source' what volume from the origin (vol1 or vol2) map to vol_in_the_replica? 2008-02-12 11:34 that should work, i think 2008-02-12 11:34 I'm not specifying the port anywhere 2008-02-12 11:34 you mean on target? 2008-02-12 11:35 no, source will tell the target which port it should listen to 2008-02-12 11:36 how? 2008-02-12 11:36 I mean, if the volume name is not the same in the source and in the replica, how does the replica know that vol1 should be written to vol_in_the_replica? 2008-02-12 11:36 when it starts replication with 'zumastor receive start ... 
' 2008-02-12 11:37 mmm I don't see it 2008-02-12 11:37 we record that in /var/lib/zumastor/$vol/source/name 2008-02-12 11:37 in zumastor define source , that "vol" has to be the same name as in the origin or might it be different? 2008-02-12 11:38 if you don't give --name option to 'zumastor define source', it will use the same name 2008-02-12 11:38 # zumastor define source 2008-02-12 11:38 usage: /bin/zumastor define source [-p|--period ] [-y|--yes] 2008-02-12 11:38 where's the --name option? 2008-02-12 11:38 if you want to use different names on source and target, you will use --name option with 'zumastor define source' to tell target what source volume name is 2008-02-12 11:39 hmm, there is a bug in 'zumastor define source' usage message 2008-02-12 11:39 ok, it's there but it's not shown 2008-02-12 11:39 :-) 2008-02-12 11:40 you have found so many bugs :) 2008-02-12 11:40 I was wrong and zumastor is right 2008-02-12 11:40 great! 2008-02-12 11:40 that reminds me of a book by Ian Stewart 2008-02-12 11:41 it went something like: "when we started, we didn't know how to add... now we don't even know if numbers actually mean anything.
Mathematically it's a great success" 2008-02-12 11:41 :-D 2008-02-12 11:41 :) 2008-02-12 11:49 67 bugs opened so far 2008-02-12 11:49 I think I can claim 20 of them were opened/inspired by me :-D 2008-02-12 11:53 number does matter :) 2008-02-12 12:03 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-12 12:17 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-12 12:54 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-12 12:55 ooops 2008-02-12 12:55 Starting zumastor 2008-02-12 12:55 Starting volume 'zumatest' 2008-02-12 12:55 start volume error 2008-02-12 12:55 and it does not mount the volume :-/ 2008-02-12 12:57 Tue Feb 12 21:54:38 CET 2008 /etc/init.d/zumastor[4975]: starting volume 'zumatest' 2008-02-12 12:57 Tue Feb 12 21:54:38 2008: [5270] main: Could not open snapshot store /var/lib/zumastor/volumes/zumatest/device/snapstore: No such file or directory 2008-02-12 12:57 Tue Feb 12 21:54:38 2008: [5272] create_socket: Can't connect to control socket /var/run/zumastor/servers/zumatest: No such file or directory 2008-02-12 12:58 root@spectrum:~# lvdisplay 2008-02-12 12:58 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:58 /dev/cciss/c0d1: read failed after 0 of 4096 at 6751104270336: Input/output error 2008-02-12 12:58 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:58 No volume groups found 2008-02-12 12:59 root@spectrum:~# vgdisplay 2008-02-12 12:59 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:59 /dev/cciss/c0d1: read failed after 0 of 4096 at 6751104270336: Input/output error 2008-02-12 12:59 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:59 No volume groups found 2008-02-12 12:59 root@spectrum:~# pvdisplay 2008-02-12 12:59 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:59 
/dev/cciss/c0d1: read failed after 0 of 4096 at 6751104270336: Input/output error 2008-02-12 12:59 /dev/cciss/c0d1: read failed after 0 of 4096 at 0: Input/output error 2008-02-12 12:59 it seems that all the data has been lost :-O 2008-02-12 13:08 even the partition has disappeared from 'parted' 2008-02-12 13:08 either severe corruption has happened, or this is a hardware failure 2008-02-12 13:08 I hope it's the latter 2008-02-12 13:10 hm 2008-02-12 13:17 [ 200.421221] Buffer I/O error on device cciss/c0d1, logical block 0 2008-02-12 13:17 [ 200.421297] Buffer I/O error on device cciss/c0d1, logical block 1 2008-02-12 13:17 [ 200.421356] Buffer I/O error on device cciss/c0d1, logical block 2 2008-02-12 13:17 [ 200.421418] Buffer I/O error on device cciss/c0d1, logical block 3 2008-02-12 13:17 [ 200.421535] cciss: cmd dfc80000 has CHECK CONDITION sense key = 0x3 2008-02-12 13:17 [ 200.421540] Buffer I/O error on device cciss/c0d1, logical block 0 2008-02-12 13:17 [ 200.421820] cciss: cmd dfc80000 has CHECK CONDITION sense key = 0x3 2008-02-12 13:17 [ 200.421973] cciss: cmd dfc80000 has CHECK CONDITION sense key = 0x3 2008-02-12 13:17 there are many more of those CHECK CONDITION errors 2008-02-12 13:19 that is not a zumastor or ddsnap error 2008-02-12 13:19 never saw such things before 2008-02-12 13:19 but it probably going to test some of our error paths 2008-02-12 13:19 in ways that have not been tested before 2008-02-12 13:20 wonder if reboot will help 2008-02-12 13:23 jiayingz: I've rebooted twice, it did not help :-/ 2008-02-12 13:23 I'm pasing the disk self test now, will try to enter the disk controller BIOS later 2008-02-12 13:30 mmm 2008-02-12 13:31 I have those disks in RAID 0 for the tests, if one of the disks died that may be the cause 2008-02-12 13:31 all this time I thought I was in RAID 6 :- 2008-02-12 13:35 pgquiles_, the hp array is trying to tell you to replace a disk I think 2008-02-12 13:36 oh yes 2008-02-12 13:36 disk in bay 3 :-/ 2008-02-12 
13:36 my god, these disks have been in use for 2 months! 2008-02-12 13:37 but this is a good chance for us to find out how reasonable our error paths are 2008-02-12 13:37 sorry for you being the experiment ;) 2008-02-12 13:37 :-) 2008-02-12 13:37 fortunately I did not buy them through HP 2008-02-12 13:37 those morons only give you 1 year of warranty for SATA disks 2008-02-12 13:39 sorry if any of you work for HP :-) 2008-02-12 13:39 the disk controller said the disk was failing but now seems to be working, so I just enabled it 2008-02-12 13:39 shapor, ping? 2008-02-12 13:39 and it works 2008-02-12 13:39 I see 2008-02-12 13:40 now let's see if zumastor got confused at all after taking that hit 2008-02-12 13:40 volume zumatest works fine and snapshots seem to have been mounted flawlessly 2008-02-12 13:41 can't say the same for volume zumatest2 2008-02-12 13:41 data is there, and it looks like it's not corrupted 2008-02-12 13:41 but snapshots are not there 2008-02-12 13:41 and zumastor status --usage hangs on zumatest2 2008-02-12 13:42 probably the disk is actually failing, because lvdisplay is frozen too 2008-02-12 13:43 oh, snapshots for zumatest2 are being mounted 2008-02-12 13:44 that sounds believable 2008-02-12 13:45 how badly does --usage hang? can you break out? d state? 2008-02-12 13:45 should be able to break out 2008-02-12 13:46 the snapshot server might be in d state though 2008-02-12 13:46 what is d state? 
2008-02-12 13:47 what you see in ps if a task is stuck waiting for something in kernel 2008-02-12 13:47 like IO on a dead disk 2008-02-12 13:47 oh, ok 2008-02-12 13:47 I thought it was something specific to zumastor :-) 2008-02-12 13:48 no, zumastor and ddsnap processes are in S or S+ 2008-02-12 13:49 good 2008-02-12 13:49 zumastor status --usage has finished successfully 2008-02-12 13:49 I think it's just it's still checking and mounting 2008-02-12 13:49 ah, so it was just slow 2008-02-12 13:49 we rescan the freespace in the snapshot store on every zumastor --status for now, just to be paranoid 2008-02-12 13:50 only the freespace? 2008-02-12 13:50 the first 1TB volume mounted very quickly (it only contains 22GB of data) 2008-02-12 13:50 some other sanity checks happen as a side effect 2008-02-12 13:50 the second 1TB volume is taking much longer (it contains 147GB of data) 2008-02-12 13:50 right 2008-02-12 13:51 we are nearing the point where we can turn some of the internal checks off by default 2008-02-12 13:51 I'm still not allowed to take snapshots 2008-02-12 13:52 not allowed means? 2008-02-12 13:52 gives an error? 2008-02-12 13:52 oh 2008-02-12 13:52 still starting 2008-02-12 13:52 yes, we must start faster 2008-02-12 13:53 mmm is the lock common to all volumes? I'm not allowed to take snapshots of the first volume, either 2008-02-12 13:54 we don't have any common locks between volumes 2008-02-12 13:54 device mapper may 2008-02-12 13:54 it takes the big kernel lock 2008-02-12 13:54 what do you mean by "not allowed"? what is in the log?
2008-02-12 13:55 a# zumastor snapshot zumatest hourly 2008-02-12 13:55 /bin/zumastor: /var/run/zumastor/locking exists, please wait if zumastor is in the middle of start/stop or call zumastor force-reload to recover 2008-02-12 13:56 but I've just taken a snapshot of zumatest 2008-02-12 13:56 ok, I lied 2008-02-12 13:56 on zumatest2 is just frozen 2008-02-12 13:56 yes we have a startup lock 2008-02-12 13:56 shapor and jiaying worked on that, not me 2008-02-12 13:57 is zumatest2 expected to be stuck because of the disk problem? 2008-02-12 13:57 finding out where it is stuck is useful 2008-02-12 13:57 no, I think it's taking another snapshot on its own 2008-02-12 13:58 /dev/mapper/zumatest2(128) 2008-02-12 13:58 985G 147G 838G 15% /var/run/zumastor/snapshot/zumatest2/2008.02.12-22.53.52 2008-02-12 13:58 I ran zumastor snapshot zumatest2 hourly a couple of minutes later than that time 2008-02-12 13:58 meanwhile... 2008-02-12 13:58 ddcreate foo/bar 100 linear /dev/ubdb 0 2008-02-12 13:58 Error: Invalid argument (bad device name for foo/bar) 2008-02-12 13:58 i think it is still starting zumastor 2008-02-12 13:58 how is that for a nice informative error message from kernel? 2008-02-12 13:59 see, there is a / in foo/bar, an illegal devmapper name 2008-02-12 13:59 what did you do to get that error? 2008-02-12 14:01 I have many "warning: detected unreferenced snapshot: 124" (or other snapshot number). Will zumastor drop/find those snapshots by itself? 2008-02-12 14:07 oh, I am testing an alternative interface to devmapper 2008-02-12 14:07 not connected with zumastor... 
yet 2008-02-12 14:07 just making the error messages nice now, devmapper has historically been quite confusing with its error messages 2008-02-12 14:08 pgquiles_, I think it wants you to drop it, but ask shapor 2008-02-12 14:08 we have that in there mainly to detect snapshot leaks so we can fix them 2008-02-12 14:09 ok 2008-02-12 14:09 shapor: ping 2008-02-12 14:09 the snapshot on zumatest2 has been taken 2008-02-12 14:10 zumastor is able to recover from a bad situation 2008-02-12 14:10 congrats to everybody, this thing works :-) 2008-02-12 14:13 btw, are zumastor/ddsnap commands atomic? i. e. if one of the steps in the /bin/zumastor is interrupted before the whole function was executed, could that lead to corruption/trouble? 2008-02-12 14:20 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 14:28 willn: ping 2008-02-12 14:28 pgquiles: that should not lead to corruption 2008-02-12 14:28 jiayingz: great, thanks 2008-02-12 14:28 we have tested randomly shutdown zumastor/ddsnap/machine 2008-02-12 14:28 pgquiles_: pong 2008-02-12 14:28 zumastor should handle those situations 2008-02-12 14:29 shapor: [23:01] I have many "warning: detected unreferenced snapshot: 124" (or other snapshot number). Will zumastor drop/find those snapshots by itself? 2008-02-12 14:30 it should get autodeleted when you run out of snapshots 2008-02-12 14:30 shapor: should I drop those snapshots manually? 2008-02-12 14:30 cool 2008-02-12 14:30 does /bin/zumastor return uniform values for success, fail, etc? 2008-02-12 14:30 its supposed to 2008-02-12 14:31 ok 2008-02-12 14:31 never been extensively tested 2008-02-12 14:31 same here: I've seen 1, 0, etc but have not read exhaustively /lib/zumastor/common and /bin/zumastor 2008-02-12 14:36 what's that "write density" zumastor status reports? (it's 0 in my case) 2008-02-12 14:55 pgquiles_: pong 2008-02-12 14:58 willn: are you maintaining the debian packaging? 
2008-02-12 14:59 there are some files (copyright, control, etc) which could use some improvements (nothing important) 2008-02-12 15:09 even replication is working after the crash :-) 2008-02-12 15:12 pgquiles_: I've just started look at them 2008-02-12 15:13 willn: I improved a bit the packaging for Ubuntu 2008-02-12 15:13 willn: added some missing dependencies, etc 2008-02-12 15:13 Trying to get packages built as things change in SVN -- https://launchpad.net/~zumastor-team 2008-02-12 15:13 pgquiles_: Please send patches :) 2008-02-12 15:15 willn: zumastor-team in Launchpad!? 2008-02-12 15:15 Created for a multi-user owned ppa 2008-02-12 15:15 that's new to me 2008-02-12 15:15 oh 2008-02-12 15:16 zumastor, ddsnap, kernel-image-zumastor and kernel-image-xenzumastor are in my PPA 2008-02-12 15:16 https://launchpad.net/~pgquiles/+archive 2008-02-12 15:16 Yea, I saw those -- I'm toying with build-on-commit/test 2008-02-12 15:17 Props on the kernel-image-* metapackages, it is quite handy 2008-02-12 15:20 I noticed today that there is kernel-image-2.6.22-14.51 2008-02-12 15:20 Out of curiosity, how are you going about doing the kpkg builds. Are you using make-kpkg with the buildpackage target? 
2008-02-12 15:20 no 2008-02-12 15:20 I'm just ignoring them 2008-02-12 15:20 I took the Ubuntu kernel, then added -zumastor and -xenzumastor flavors 2008-02-12 15:21 -zumastor is -server with ddsnap patches, -xenzumastor is -xen with ddsnap patches 2008-02-12 15:22 ah, groovy 2008-02-12 15:22 then there's also the restricted drivers, ubuntu modules, backport modules, etc 2008-02-12 15:23 I feel bad for the build cluster =] 2008-02-12 15:23 you have to have the whole set if you want to be 100% compatible with the existing kernels, restricted drivers, have apparmor working, etc 2008-02-12 15:23 one thing I have not been able to do is using distcc in a pbuilder 2008-02-12 15:24 that'd be a great improvement because currently it takes more than 5 hours to build a kernel 2008-02-12 15:24 (restricted, etc only a few minutes) 2008-02-12 15:26 Yea 2008-02-12 15:26 Luckily our patches don't change that often 2008-02-12 15:26 yes 2008-02-12 15:26 I see you are building for dapper 2008-02-12 15:27 I'm building for gutsy 2008-02-12 15:27 I started with dapper, provided the build does not blow up as it stands, i'll change the source builder to do all dists 2008-02-12 15:28 ok 2008-02-12 15:28 have you automated source package generation? (cron or anything?) 2008-02-12 15:29 We've got a system to trigger based on new SVN checkins 2008-02-12 15:29 I haven't hooked in yet, want to verify some manual runs first 2008-02-12 15:29 but that will be the way to go eventually 2008-02-12 15:30 mmm 2008-02-12 15:30 does you package report the zumastor version correctly? zumastor --version I mean 2008-02-12 15:39 I do believe they do 2008-02-12 15:40 I'd say they don't 2008-02-12 15:40 (mine don't, either) 2008-02-12 15:40 I get "zumastor revision 1348 built on Mon Feb 11 11:21:48 PST 2008 by build@unassigned 2008-02-12 15:40 downloaded from ppa? 2008-02-12 15:41 No, that's built from the source upload to the ppa 2008-02-12 15:41 Pending on the ppa to finish the build... 
2008-02-12 15:41 did you build it with dpkg-buildpackage or with pbuilder? 2008-02-12 15:41 if it does not work, I'll get on fixing it 2008-02-12 15:41 buildpackage 2008-02-12 15:41 yeah :-) 2008-02-12 15:42 problem is this: 2008-02-12 15:42 Makefile uses svn to determine version, but usually your source packages do not contain .svn directories 2008-02-12 15:42 so svn returns nothing 2008-02-12 15:43 mmm I'm seeing a SVNREV file which will make it work 2008-02-12 15:43 I thought we switched to that... let me look 2008-02-12 15:44 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-12 15:45 Yea, if SVNREV is present, the version number should come from there 2008-02-12 15:48 Taking forever to get the build started on those packages... 2008-02-12 15:49 yes, sometimes launchpad is quite slow 2008-02-12 15:50 at the beginning, when it was not public yet, the kde-languages team blocked it for 4 days solid 2008-02-12 15:50 they built all the language packages for all ubuntu versions :-D 2008-02-12 15:52 heh 2008-02-12 15:54 -!- cbsmith(~xman@adsl-71-133-80-65.dsl.irvnca.pacbell.net) has joined #zumastor 2008-02-12 15:54 The joys of shared resources 2008-02-12 16:00 Hooray, first i386 packages started building 2008-02-12 16:01 and their done... Let me check versioning 2008-02-12 16:03 as long as there's a SVNREV file, it'll be fine 2008-02-12 16:06 You've got to wait for arch's to complete before they publish? 2008-02-12 16:08 no 2008-02-12 16:09 packages are published as they are built but there's a delay (usually about 15-20 min) 2008-02-12 16:09 hmm. 
2008-02-12 16:09 Thats unfortunate 2008-02-12 16:09 actually, the .deb's are there, but Packages and Packages.gz take a bit more to be available 2008-02-12 16:10 Don't even have debs yet 2008-02-12 16:10 http://ppa.launchpad.net/zumastor-team/ubuntu/pool/main/z/zumastor/ 2008-02-12 16:12 -!- zumalog(~zumalog@yzf.shapor.com) has joined #zumastor 2008-02-12 16:13 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 16:21 debs arrived... 2008-02-12 16:23 zumastor revision 1359 built on Tue Feb 12 23:54:55 UTC 2008 by buildd@samarium 2008-02-12 16:23 cool 2008-02-12 16:24 So, for those packages looks like we're ~1 hour turnaround 2008-02-12 16:27 you cannot trust that 2008-02-12 16:27 sometimes building starts immediately, sometimes takes 10 hours 2008-02-12 16:28 Yea, I should have clarified. For those packages that just built :) 2008-02-12 16:30 ah, ok 2008-02-12 16:30 We're still pending for amd64 2008-02-12 16:30 I thought you were trying to make estimation for a future automatization 2008-02-12 16:30 Be nice if this information was scrapable 2008-02-12 16:31 what information? 2008-02-12 16:31 availability? 2008-02-12 16:31 binaries cost about 1/5 of source, in terms of disk 2008-02-12 16:31 Build status 2008-02-12 16:32 it's probably scrapable 2008-02-12 16:32 look at the icon 2008-02-12 16:33 and there's the alt="[BUILDING]" 2008-02-12 16:33 eee, I was hoping for something nice -- rss or something 2008-02-12 16:34 pgquiles: You mentioned you had some changes to the packages we're building? 2008-02-12 16:34 willn: yes 2008-02-12 16:34 (He asked the question!) 2008-02-12 16:34 willn: better copyright, control, etc 2008-02-12 16:35 willn: in my PPA 2008-02-12 16:36 righto. Do you think a section of 'utils' is ok? 2008-02-12 16:37 willn: for the package? 2008-02-12 16:38 do you mean 'utils' instead of 'admin'? 2008-02-12 16:38 er 2008-02-12 16:39 Yea. 
Admin probably works too, I missed that you had picked that 2008-02-12 16:40 I think ddsnap and zumastor are at least at the level of lvm, which is in admin 2008-02-12 16:40 it made sense to me, which means, err, nothing actually :-) 2008-02-12 16:45 =] 2008-02-12 16:46 please note I'm not a debian or ubuntu developer, just a guy packaging lots of stuff because I hate to have a dirty system 2008-02-12 16:46 drake is the right person to ask what section to place the packages 2008-02-12 16:47 ...ask for what... 2008-02-12 16:48 So, I don't think we need to depend on subversion for the build environment anymore 2008-02-12 16:50 yes 2008-02-12 16:50 did you add bash as a build dependency? 2008-02-12 16:52 Yea 2008-02-12 16:53 Has the patch been rolled into the mainline? 2008-02-12 16:53 I don't think so 2008-02-12 16:53 it's issue 63 2008-02-12 16:53 I sent the patch but the issue is still open, so I'd say patch not applied 2008-02-12 16:55 -!- natalie(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-12 16:56 Hrm. Thats a touchy issue. 2008-02-12 17:06 dkegel: I thought zumastor had moved from sh to bash with no turning back 2008-02-12 17:09 I just applied patches in issue 63 2008-02-12 17:09 pending a review i'm checking in 2008-02-12 17:12 ok 2008-02-12 17:12 google code should allow to remove an attachment without removing the comment it is attached to 2008-02-12 17:13 pgquiles: right, it is unlikely that we will move from bash to sh. The bashisms test is there to remind us that some people think it would be a good idea, though. 2008-02-12 17:14 :-) 2008-02-12 17:21 commited 2008-02-12 17:22 let me kick off a build... 2008-02-12 17:24 willn: the quickest way to have a ppa-like build is to use pbuilder on your system 2008-02-12 17:24 Yeah. We do builds locally.. 
Theres a certain nice-factor to publicly hosted builds as well 2008-02-12 17:25 indeed 2008-02-12 17:25 when I started using PPA, it was mainly because I had one computer at work and one at home and it was quite inconvenient to carry the packages or scp them from one machine to the other 2008-02-12 17:26 so I just made them public and added the line to my sources.list 2008-02-12 17:26 :-D 2008-02-12 17:27 kind of linus torvalds: "real men make no backups, they upload their stuff to public ftp's for people to replicate it" :-) 2008-02-12 17:36 we need more patches for issues. Its easier to apply a patch than create a patch ;) 2008-02-12 17:37 :-D 2008-02-12 17:42 willn: I have one more patch. I sent it somewhere but I cannot find what issue I attached it to 2008-02-12 17:42 it's about issues 54 and 60 2008-02-12 17:42 maybe I sent it to the mailing list :-? 2008-02-12 17:43 it's a doc patch 2008-02-12 17:49 attached to 54 2008-02-12 17:55 I see 2008-02-12 18:30 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-12 18:44 ACTION considers going home after the last commits.. 2008-02-12 18:46 ACTION goes to bed, it's so late it's almost early 2008-02-12 19:47 I guess with gutsy we're actually building for 3 arch's 2008-02-12 21:10 somebody pinged? irc.oftc.net #zumastor log beginning Wed Feb 13 00:00:01 PST 2008 2008-02-13 00:26 devil may cry is an excellent crazed rampage 2008-02-13 00:26 ACTION doing research 2008-02-13 00:43 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-13 01:35 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-13 01:45 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-13 02:02 -!- willn(~wan@pinball.ccs.neu.edu) has joined #zumastor 2008-02-13 03:45 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-13 05:35 shapor: what should I zero? 
/dev/mapper/zumatest or /dev/mapper/sysvg-test and /dev/mapper/sysvg-test_snap? 2008-02-13 08:02 /dev/mapper/sysvg-test 2008-02-13 08:25 willn: ok, thanks 2008-02-13 08:26 it'd be interesting to include that trick in the howto 2008-02-13 08:41 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-13 08:49 we were talking about a --zero option to --initialize 2008-02-13 09:25 the latest packages built super fast 2008-02-13 09:25 <5 minutes on the build cluster 2008-02-13 09:37 jiayingz: Are you going to apply the patch in issue 70? 2008-02-13 09:37 or are you pending review for that 2008-02-13 09:38 I am going to apply it 2008-02-13 09:38 assuming no objections 2008-02-13 09:38 lgtm :) 2008-02-13 09:38 thanks for the review ;) 2008-02-13 09:50 I've got the makings of a patch to resolve issue 53 2008-02-13 09:50 Once I finish testing it i'll attach for review 2008-02-13 10:06 http://code.google.com/p/zumastor/issues/detail?id=53 2008-02-13 10:14 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-13 10:22 http://code.google.com/p/zumastor/issues/detail?id=56 2008-02-13 10:34 willn: I like the --zero option. 
Progress can be reported by kill'ing -USR1 dd 2008-02-13 10:59 ACTION hmms 2008-02-13 11:12 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 11:13 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-13 11:14 "well make your mark on bash and you can die knowing you lived a full life" 2008-02-13 11:14 that made my dat 2008-02-13 11:14 day 2008-02-13 11:14 :-) 2008-02-13 11:21 pgquiles: created issue 73 2008-02-13 11:21 willn: ack 2008-02-13 13:27 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 13:50 -!- jiayingz(~jiayingz@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 13:57 We need that --go-as-fast-as-possible option when replicating (for machines that are network close, where gzip is major overhead) 2008-02-13 14:28 Created issue 74 to that idea 2008-02-13 14:47 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-13 14:47 -!- willn(~wan@pinball.ccs.neu.edu) has joined #zumastor 2008-02-13 14:47 -!- natalie(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 14:47 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 14:47 -!- flipz(~phillips@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 14:47 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-13 14:47 -!- jiayingz(~jiayingz@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 14:47 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-13 14:47 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-13 14:47 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-13 14:47 -!- juuva(juuva@peili.org) has joined #zumastor 2008-02-13 14:47 -!- ChanServ changed topic to "http://www.zumastor.org" 2008-02-13 14:47 how much data store the volumes in the replication tests? 
2008-02-13 14:48 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-13 14:51 pgquiles: pardon? 2008-02-13 14:51 willn: yes 2008-02-13 14:51 willn: if automated tests are performed with empty volumes, they may work fine 2008-02-13 14:52 but today I tried what jiayingz asked me yesterday: 5GB volume with 5s replication cycle 2008-02-13 14:52 the logs show some errors, more or less the same I had with 1TB volumes and longer (30 min) replication cycles 2008-02-13 14:52 but things started to look worse when I added 2GB of data to that volume 2008-02-13 14:52 I had some errors I had not seen yet 2008-02-13 14:53 Sounds like we should run that same test before we release 0.7 2008-02-13 14:53 Ok. I've got a test running for starters with an 800G empty volume 2008-02-13 14:53 dkegel: I'm quite busy until saturday afternoon but I'll try to find a moment to paste the relevant parts of the logs 2008-02-13 14:54 Thanks... 2008-02-13 14:54 btw, the ioctl problem is still present and lsof says there are no open files, which makes me wonder why is it happening 2008-02-13 14:55 dmsetup info says the snapshot is active at that very moment 2008-02-13 14:55 (I added some loggin to /bin/zumastor) 2008-02-13 14:55 logging 2008-02-13 14:55 once that finishes, ill do a populated test 2008-02-13 14:55 willn: the 800G test will mostly succeed 2008-02-13 14:56 Populated or empty? 2008-02-13 14:56 both of them 2008-02-13 14:56 ok 2008-02-13 14:56 the only problem I saw is if replication cycle is too short, you may hit the snapshot limit 2008-02-13 14:56 then you don't have snapshots until replication finished 2008-02-13 14:56 other than that, it works like a charm 2008-02-13 14:56 I'll try and stress things as much as I can before the release.. 
The biggest issue is replication time 2008-02-13 14:56 I tried rebooting the servers, unplugging them from network, etc 2008-02-13 14:56 (initial) 2008-02-13 14:57 zumastor always did the right thing 2008-02-13 14:59 pg, you are a walking soundbite generator :-) 2008-02-13 15:00 :-D 2008-02-13 15:00 I see myself as a major blocker for a new relese :-) 2008-02-13 16:30 dkegel, ping 2008-02-13 16:31 pgquiles, we like your kind of blocker 2008-02-13 16:32 pgquiles, I am nearly ready to offer an ioctl-less device mapper setup utility with much better error reporting, that is one way we could investigate this 2008-02-13 16:34 flips: will it be compatible with the current devmapper? (i. e. will I need to recreate my volumes?) 2008-02-13 16:34 pgquiles, strictly compatible, but only implements the commands we use 2008-02-13 16:34 i.e., create, remove, suspend, resume 2008-02-13 16:35 I am not sure we even use the last two, but we might 2008-02-13 16:35 if it works fine with zumastor, it's good enough for me :-) 2008-02-13 16:35 :-) 2008-02-13 16:35 will that be for 0.8? 2008-02-13 16:35 that's what I think. I use it for all my devmapper testing now 2008-02-13 16:36 a reasonable goal 2008-02-13 16:36 it is pretty stable, possibly more stable than the real thing 2008-02-13 16:39 we won't stop searching for the dmsetup remove issue though 2008-02-13 16:39 we have seen funny behavior like that in the past 2008-02-13 16:39 maybe shapor can refresh my memory 2008-02-13 16:40 today I saw some errors I had not seen yet 2008-02-13 16:41 very quick replication cycle, relatively small volume (5G), add quite some data in between snapshots and bang! 2008-02-13 16:41 you lose snapshots but nothing more 2008-02-13 16:41 I cannot remember the details now but I think it was about unsorted snapshots 2008-02-13 16:41 we need it to work reliably even for very quick cycles of course 2008-02-13 16:42 unsorted? 
2008-02-13 16:42 s/lose snapshots/miss snapshots 2008-02-13 16:42 flips: IIRC, yes 2008-02-13 16:42 I'll look at the logs tomorrow 2008-02-13 16:42 you mean, snapshot sequence with gaps in it, or actually in the wrong order? 2008-02-13 16:43 wrong order 2008-02-13 16:44 order as seen in the log? 2008-02-13 16:44 but do not trust me too much, I was very tired when I read the logs this morning and I'm very tired now :-) 2008-02-13 16:44 that's ok 2008-02-13 16:44 as seen in the log, yes 2008-02-13 16:44 -!- charlesnw(~charles@ses.siderean.com) has left #zumastor 2008-02-13 16:44 if you have any funny looking log, please post it to the list and we can talk about it 2008-02-13 16:44 I'll do 2008-02-13 16:44 getting late again over there... 2008-02-13 16:45 indeed 2008-02-13 16:45 fortunately I have a very flexible schedule 2008-02-13 16:46 my boss knows I spend hours chatting here and that accounts as work for me :-) 2008-02-13 16:46 s/accounts/counts (broken English here :-) 2008-02-13 16:47 flips, pong 2008-02-13 16:48 hi 2008-02-13 16:49 got a cell phone back on line? 2008-02-13 17:37 root@usermode:~# sh test 2008-02-13 17:37 ddcreate: Invalid argument (unknown target type '0') 2008-02-13 17:38 <- that was me introducing a bug 2008-02-13 17:38 nice error message, hmm? 
2008-02-13 17:38 I took the dm target type from the wrong end of the list of strings
2008-02-13 17:38 code that produced the error:
2008-02-13 17:39 if (!target->type)
2008-02-13 17:39     return ddlink_error(ddm->ddi, -EINVAL,
2008-02-13 17:39         "unknown target type '%s'", type);
2008-02-13 17:39 using the fancy new ddlink_error feature
2008-02-13 17:41 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor
2008-02-13 17:43 changed list_add to list_add_tail and it worked
2008-02-13 17:44 in the dark ages before ddlink_error, would have been busy putting in printks right now
2008-02-13 18:42 here comes another one:
2008-02-13 18:42 root@usermode:~# sh test
2008-02-13 18:42 ddcreate: Invalid argument (tried to read only 11 bytes of 12 byte item)
2008-02-13 18:43 not bad for a kernel error message delivered straight to a C program, hmm?
2008-02-13 18:43 if (size > len)
2008-02-13 18:43     return ddlink_error(dd, -EINVAL,
2008-02-13 18:43         "tried to read only %Lu bytes of %u byte item",
2008-02-13 18:43         (unsigned long long)len, size);
2008-02-13 18:44 hmm, I seem to recall there is a special printf option for size_t
2008-02-13 18:47 drat, kernel does not support %z
2008-02-13 18:53 oh wait, it was just me
2008-02-13 18:53 %zi
2008-02-13 18:56 time to go to bed
2008-02-13 18:56 night
2008-02-13 19:33 root@usermode:~# sh test
2008-02-13 19:33 ddcreate: Cannot allocate memory (write failed)
2008-02-13 20:10 eh?
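The bounds check pasted at 18:43, and the size_t format question that follows it, can be sketched in userspace C. This is only an illustration under stated assumptions: record_error() and last_error are hypothetical stand-ins invented here, not the real ddlink_error, and both lengths are assumed to be size_t so that the standard %zu conversion applies without casts.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: buffer that holds the text of the most recent error. */
static char last_error[128];

/*
 * Hypothetical stand-in for ddlink_error(): store a formatted message
 * and hand the (negative) error number back to the caller.
 */
static int record_error(int err, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	vsnprintf(last_error, sizeof last_error, fmt, args);
	va_end(args);
	return err;
}

/*
 * Refuse to read an item of `size` bytes when only `len` bytes remain.
 * %zu is the C99 conversion for size_t, so no casts are needed.
 */
static int check_item(size_t len, size_t size)
{
	if (size > len)
		return record_error(-EINVAL,
			"tried to read only %zu bytes of %zu byte item",
			len, size);
	return 0;
}
```

A caller would test the return value for a negative error number and, if it is one, print last_error next to strerror(-err), which is roughly the shape of the ddcreate output shown in the log.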
2008-02-13 22:24 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-13 22:55 -!- juuva_(juuva@peili.org) has joined #zumastor irc.oftc.net #zumastor log beginning Thu Feb 14 00:00:01 PST 2008 2008-02-14 00:49 shapor, intentionally triggered the ENOMEM just to make sure the error path worked 2008-02-14 00:49 and then didn't explain just to check if anybody was reading ;) 2008-02-14 01:07 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-14 02:19 -!- ChanServ changed topic to "http://www.zumastor.org" 2008-02-14 02:33 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-14 03:02 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-14 07:58 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-14 08:13 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 08:14 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 08:36 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-14 08:37 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-14 08:43 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-14 09:54 :) 2008-02-14 10:35 ACTION waits for data 2008-02-14 10:36 127588321 bytes (128 MB) copied, 25.7182 seconds, 5.0 MB/s 2008-02-14 10:36 It's going to be a long while before the volume is fully populated 2008-02-14 10:44 we need a clever way of knowing which end is the bottleneck so we can attack that first 2008-02-14 10:44 something about whether the upstream replicating process is waiting or not 2008-02-14 10:45 assuming the network is not the bottleneck 2008-02-14 10:47 In this case, the network is 100% not the bottleneck 2008-02-14 10:47 (Two machines plugged into their own gigE switch) 2008-02-14 10:48 Network utilization is about 20KB/s at the moment 2008-02-14 10:48 but, the master 
is having data populated, so im not expecting anything just yet 2008-02-14 10:50 Writing from /dev/zero to anyfile is peaking at 6 MB/s average at 1.8 MB/s 2008-02-14 10:52 It would be beneficial to try out the same test on machines with >1 spindle 2008-02-14 10:57 that is expected performance for one spindle, until we do some optimizations 2008-02-14 10:58 the current write patch is serial/synchronous/seek intensive 2008-02-14 10:58 write path 2008-02-14 10:58 should be parallel/asynchronous/minimal seeks 2008-02-14 10:59 Replication is working anyway 2008-02-14 11:03 I gave up. Populating the volume offline 2008-02-14 11:05 raw disk write peak at 100 MB/s and average 60 2008-02-14 11:09 remember when all hard disks were 5 mb/sec? 2008-02-14 11:10 Yea. It's painful to revisit (hence my abort of that process) 2008-02-14 13:05 -!- pgquiles(~pgquiles@130.Red-80-39-172.dynamicIP.rima-tde.net) has joined #zumastor 2008-02-14 13:17 ACTION waits for replication 2008-02-14 13:17 Generated 400G of data over lunch while zumastor was offline 2008-02-14 14:02 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 14:13 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-14 14:13 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-14 15:16 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-14 15:17 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-14 18:35 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 21:25 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-14 21:26 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-14 21:41 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 22:05 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-14 22:50 ACTION gives up for the evening 2008-02-14 22:53 
-!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-14 22:55 thats the spirit 2008-02-14 22:55 :) 2008-02-14 23:02 Digging through bin/zumastor and zumastor/common is making my eyes bleed 2008-02-14 23:02 I'll finish this tommorow 2008-02-14 23:04 feel free to ask any questions about it 2008-02-14 23:05 there are a handful of not obvious things because we're trying to make bash do a lot 2008-02-14 23:45 bash just doesn't get any respect 2008-02-14 23:46 writing daemons in bash is uber elite irc.oftc.net #zumastor log beginning Fri Feb 15 00:00:01 PST 2008 2008-02-15 00:55 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-15 03:00 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-15 04:42 "After a nuclear holocost the only thing left will be cockroaches, twinkies and SCO. If you chop off the heads of SCO lawyers they continue to live for a week. Just when you think they're dead those tiny little litigating arms start moving again." -- slashdot wag 2008-02-15 04:43 ACTION laughs until the tears come 2008-02-15 05:51 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has joined #zumastor 2008-02-15 05:56 -!- MaZ1(~MaZe@c-24-6-86-168.hsd1.ca.comcast.net) has left #zumastor 2008-02-15 06:58 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-15 07:30 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-15 08:16 -!- ChanServ changed topic to "http://www.zumastor.org" 2008-02-15 08:18 flips: 2142 lines of bash in total between the two. 
2008-02-15 09:37 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-15 09:42 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-15 10:41 willn, it provides a lot of functionality for only 2k lines 2008-02-15 10:56 -!- MaZ1(~MaZe@216-239-45-4.google.com) has joined #zumastor 2008-02-15 10:57 -!- MaZ1(~MaZe@216-239-45-4.google.com) has left #zumastor 2008-02-15 11:09 Still. 2008-02-15 11:22 willn, your suggestion for a language to rewrite in? 2008-02-15 11:34 !(*sh) 2008-02-15 11:47 well i'm hungry 2008-02-15 11:48 misfire* 2008-02-15 11:50 willn, haskell then, that will rock 2008-02-15 11:50 hmm 2008-02-15 11:50 ueber eliter 2008-02-15 11:50 Do it in acme::bleach 2008-02-15 11:54 what about C and having a libddsnap and libzumastor, thus making ddsnap and zumastor very simple executables? This would make developing applications which use ddsnap and/or zumastor easier 2008-02-15 11:58 pgquiles: Are you sure your name is not Drake? 2008-02-15 11:58 [He's been suggesting such an idea] 2008-02-15 12:00 willn: I think I suggested that 3 or 4 months ago 2008-02-15 12:01 for instance, I'm developing a web-based GUI 2008-02-15 12:01 and parsing and checking zumastor's output is, well, not ideal 2008-02-15 12:02 I could see that 2008-02-15 12:02 There is also the 'near everything must run as root' issue 2008-02-15 12:04 I took a look at the 'sudo' sources, I wanted to make it a library 2008-02-15 12:05 definitely not nice :-) 2008-02-15 12:05 if I find Nemeth's Unix Handbook, I'll take a look at that sudo source 2008-02-15 12:05 Line 133 in /in/zumastor 2008-02-15 12:06 is mpoint defined in that function? 
2008-02-15 12:06 /bin/zumastor rather 2008-02-15 12:08 er 2008-02-15 12:08 yuck 2008-02-15 12:08 it better be, as it's used in line 136 :-) 2008-02-15 12:09 its not, its defined in run_master and inherited 2008-02-15 12:09 (it looks like that anyway) 2008-02-15 12:13 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-15 12:25 pgquiles: yes a agree with that design, the script is a prototype 2008-02-15 12:25 i agree* 2008-02-15 12:48 the only person who doesn't agree with it is the one who has written the most C 2008-02-15 12:49 I consider c++ and python suitable rewrite languages, not C 2008-02-15 12:50 in the mean time, the bash prototype has few actual issues, only theoretical ones 2008-02-15 12:52 flips: Did you do a partial re-write already? 2008-02-15 12:52 yes 2008-02-15 12:52 in about 3 days 2008-02-15 12:52 if that 2008-02-15 12:53 Is it somewhere? 2008-02-15 12:55 yes, I will post it 2008-02-15 12:56 "documentation" 2008-02-15 12:56 it shows the small changes we should make to our filesystem-based database for one thing 2008-02-15 12:56 like moving the trigger pipes into /var/run 2008-02-15 12:56 and how to cleanly factor out start/top of everything from define/forget 2008-02-15 12:57 start/stop I mean 2008-02-15 12:59 as far as libraries go, what's coming is ddsetup, which gives library-like access to ddsnap without the library 2008-02-15 12:59 just needs a header file 2008-02-15 13:02 essentially, the library is implemented in kernel, which is gross but it was already there, embedded in device mapper 2008-02-15 13:02 just needed to be exposed sanely 2008-02-15 13:03 flips: actually I prefer C++ :-) 2008-02-15 13:04 :) 2008-02-15 13:04 and CMake :-) 2008-02-15 13:04 haven't tried it 2008-02-15 13:04 have seen people raving about imake 2008-02-15 13:05 there are introductory slides at my website 2008-02-15 13:05 http://media.ereslibre.es/akademy-es/cmake_intro.pdf 2008-02-15 13:05 well, that's not my website but those are my 
slides :-) 2008-02-15 13:09 the variable scoping is killing me 2008-02-15 13:09 I will try it out 2008-02-15 13:10 bash variable scoping is ill-formed, yes 2008-02-15 13:10 it's a surface blemish though 2008-02-15 13:11 much worse is not being able to pass arrays to functions for example 2008-02-15 13:11 or pass anything back except by $? and cat 2008-02-15 13:12 its troublesome to read through our code, when its assumed that the only calls to a particular function will provide missing variable declarations (not as passed vars, but just by doing wierd scoping) 2008-02-15 13:13 we try not to do that 2008-02-15 13:13 if you have an example, we should fix it 2008-02-15 13:13 there should be just two types of variables, strictly local and strictly global 2008-02-15 13:15 There are a bunch that stem from run_target (old_snap for instance) 2008-02-15 13:15 mpoint in new_snapshot 2008-02-15 13:17 sounds like cleanup is in order 2008-02-15 13:17 want to post the offending excerpt? 2008-02-15 13:18 shapor can explain why it is actually leet 2008-02-15 13:19 line 133 in /bin/zumastor:new_snapshot is the first case 2008-02-15 13:19 used again in line 136 2008-02-15 13:20 line 225 in /bin/zumastor:replicate_snapshot (old_snap) 2008-02-15 13:21 scramjet time 2008-02-15 13:22 mmm http://article.gmane.org/gmane.linux.kernel/565078/ 2008-02-15 13:22 " 2008-02-15 13:22 I ended up using O_NOATIME for the individual object "open()" calls inside 2008-02-15 13:22 git, and it was an absolutely huge time-saver for the case of not having 2008-02-15 13:22 "noatime" in the mount options. Certainly more than your estimated 10% 2008-02-15 13:22 under some loads. 
2008-02-15 13:22 " 2008-02-15 13:29 The filesystem database seems like its impossible to test 2008-02-15 13:51 -!- pgquiles(~pgquiles@130.Red-80-39-172.dynamicIP.rima-tde.net) has joined #zumastor 2008-02-15 14:06 flips: Can you also look at issue 75 2008-02-15 15:44 hi pgquiles 2008-02-15 15:45 do u use ubuntu 7.1 and saw the device mapper problem? 2008-02-15 15:52 -!- charlesnw(~charles@ses.siderean.com) has left #zumastor 2008-02-15 15:55 -!- flips(~phillips@phunq.net) has left #zumastor 2008-02-15 15:59 jiayingz: yes, I use ubuntu 7.1 2008-02-15 15:59 what devmapper problem? 2008-02-15 16:01 that fail to remove device problem 2008-02-15 16:02 yes 2008-02-15 16:02 I am thinking to find a ubuntu 7.1 machine this weekend and see if I can reproduce the problem 2008-02-15 16:02 sure, it's easy 2008-02-15 16:02 :) 2008-02-15 16:02 the easiest way to reproduce the problem would be to create a 5G volume and set a 5 seconds replication cycle 2008-02-15 16:03 if you add 2GB of data, it gets even funnier 2008-02-15 16:03 how funny? 2008-02-15 16:03 I have some logs I want to post to the mailing list about that, some errors I had not previously seen 2008-02-15 16:03 yeah, go ahead to post it 2008-02-15 16:04 I will see if i can get the same problem 2008-02-15 16:04 I've been quite busy the last few days (I have an exam tomorrow :-), thus I've had no time to cut and paste 2008-02-15 16:04 and what is the version of lvm u r using? 2008-02-15 16:04 1.02.20, the one which comes with ubuntu server 2008-02-15 16:05 if you try to create a >2TB volume, you'll need to install my 'parted' package or you'll run into https://bugs.edge.launchpad.net/ubuntu/+source/parted/+bug/107326 2008-02-15 16:08 jiayingz: I must go to be now, I'm sorry. Please e-mail me if you need more info about Ubuntu before Sunday. 2008-02-15 16:08 that was "go to bed" 2008-02-15 16:08 I am not planning to, since you saw the problem with 5G disk 2008-02-15 16:08 thanks for the help! 
good night 2008-02-15 16:18 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-15 16:18 cmake is good for userspace apps, especially ones that need to be portable to Windows 2008-02-15 16:19 Before we jump off the deep end into a rewrite of zumastor, let's make sure we don't do anything that would keep us from using zumastor on root volumes at boot time... 2008-02-15 16:28 Statically linked C++ might be ok (though I'd prefer plain C), but Python would be crazy. 2008-02-15 16:45 On a scale of crazy -- more or less than 2100 lines of bash? 2008-02-15 18:05 willn: more crazy 2008-02-15 18:05 when you want to run in a small environment, python simply doesn't exist 2008-02-15 18:21 think buffalo linkstation 2008-02-15 18:41 Sure it does 2008-02-15 18:42 ACTION has php, perl, python, whatever on his nslu2 2008-02-15 18:49 ACTION starts the zumastor volume population for the long weekend 2008-02-15 19:15 -!- SEJeff(~jeff@cpe-76-175-171-108.socal.res.rr.com) has joined #zumastor 2008-02-15 19:52 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor irc.oftc.net #zumastor log beginning Sat Feb 16 00:00:01 PST 2008 2008-02-16 02:09 willn: or initrd 2008-02-16 02:09 bash scales down further than python does 2008-02-16 05:41 -!- pgquiles(~pgquiles@130.red-80-39-172.dynamicip.rima-tde.net) has joined #zumastor 2008-02-16 06:12 -!- pgquiles(~pgquiles@130.Red-80-39-172.dynamicIP.rima-tde.net) has joined #zumastor 2008-02-16 12:32 shapor: bash also offers less functionality than any of the other language choices 2008-02-16 12:32 in terms of programmer effort, etc 2008-02-16 13:50 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-16 13:50 root@usermode:~# ddsetup deps foo 2008-02-16 13:50 ddsetup: Invalid argument (tried to read 8 bytes of 18 byte item 2008-02-16 13:50 continuing the theme of more informative error messages 2008-02-16 13:54 now... 
2008-02-16 13:55 ddsetup deps foo
2008-02-16 13:55 ddsetup: Invalid argument (invalid ioctl 8)
2008-02-16 13:55 reminding me the ioctl isn't implemented in the ioctl handler yet
2008-02-16 13:58 ...and now it is
2008-02-16 13:58 root@usermode:~# ddsetup deps foo
2008-02-16 13:58 (98.16)
2008-02-16 13:58 the implementation is:
2008-02-16 13:58 case DDDEPS:
2008-02-16 13:58     err = dependencies(ddm);
2008-02-16 13:58     break;
2008-02-16 13:59 + one entry in an enum, and one addition to a table that says it takes one string parameter (dev name)
2008-02-16 13:59 everything else just handled
2008-02-16 13:59 kernel interface programming should always be like this
2008-02-16 14:13 root@usermode:~# ddsetup deps foo
2008-02-16 14:13 foo has 1 target: (98.16)
2008-02-16 14:13 98 being the devmapper major
2008-02-16 14:14 I wonder why the first major is 16?
2008-02-16 14:14 seems odd
2008-02-16 14:15 root@usermode:~# ddsetup ls
2008-02-16 14:15 foo (254.0)
2008-02-16 14:16 echo 0 100 linear /dev/ubdb 0 | ddsetup create foo
2008-02-16 14:16 98 is the major for uml block devices
2008-02-16 14:16 254 is the major for devmapper devices
2008-02-16 15:02 have a dd tree setup yet?
2008-02-16 15:03 on your "git pc" :)
2008-02-16 15:05 writing an abstract for fast right now
2008-02-16 15:05 http://www.usenix.org/events/fast08/wips.html
2008-02-16 15:05 yes, need to dd tree myself this long weekend
2008-02-16 15:06 how about right now
2008-02-16 15:06 ...after the abstract
2008-02-16 15:08 shapor, are you going to be in MTV during fast?
2008-02-16 15:12 yes
2008-02-16 15:12 http://www.usenix.org/events/fast08/wips.html
2008-02-16 15:12 put you down as an author for the poster session?
2008-02-16 15:13 jiaying is doing the poster
2008-02-16 15:13 i can help with it
2008-02-16 15:13 the poster session is described as "during the happy hour"
2008-02-16 15:13 could be fun
2008-02-16 15:14 sounds like my kind of poster session
2008-02-16 15:14 I didn't want to influence your decision by telling you that first ;)
2008-02-16 15:15 so wednesday.. is that the first day of fast?
2008-02-16 15:15 good question
2008-02-16 15:15 I was planning on going up tuesday
2008-02-16 15:15 yeah i might do that too
2008-02-16 15:15 stay till friday, last flight out
2008-02-16 15:15 actually.. i might motorcycle up
2008-02-16 15:16 eaasy rider
2008-02-16 15:16 extend your forks
2008-02-16 15:16 ACTION vomits
2008-02-16 15:16 while (1) {
2008-02-16 15:16     wait_event_interruptible(q->throttle_wait, atomic_read(&q->available) >= need);
2008-02-16 15:16     if (atomic_sub_return(need, &q->available) >= 0)
2008-02-16 15:16         break;
2008-02-16 15:16     atomic_add(need, &q->available);
2008-02-16 15:16     wake_up(&q->throttle_wait);
2008-02-16 15:16 }
2008-02-16 15:16 try that on
2008-02-16 15:16 fix for the FIXME in bio.throttle... I think
2008-02-16 15:17 atomic_sub_return ?
2008-02-16 15:17 _return?
2008-02-16 15:17 atomically returns the resulting value
2008-02-16 15:17 ah
2008-02-16 15:18 that looks simple
2008-02-16 15:18 if another cpu grabbed the resource just after the wiat_event test, it adds it back, wakes everybody and sleeps itself
2008-02-16 15:18 need is the throttle value?
2008-02-16 15:18 took me a moment to realize that without the wake_up it could sleep forever
2008-02-16 15:18 need is the amount of resources needed
2008-02-16 15:19 available is the amount of resources available
2008-02-16 15:19 should write "avail" as is traditional
2008-02-16 15:19 how is "need" defined?
2008-02-16 15:19 right now it is the number of biovecs in the bio
2008-02-16 15:20 an approximation of the number of pages of data that will be transferred
2008-02-16 15:20 how is that good enough?
2008-02-16 15:20 I think I may change that to the maximum number of biovecs in the bio, because that can't change 2008-02-16 15:20 and is not more than 4x the actual number of bvecs 2008-02-16 15:21 which is a loose bound, but not as loose as what kernel uses now 2008-02-16 15:22 re good enough... if the driver is using that metric it needs to be sure it obeys it 2008-02-16 15:22 ->metric is a per driver method 2008-02-16 15:23 so for the one device I have implemented, ddsnap, ->metric is the number of bvecs 2008-02-16 15:24 the thing is, ->metric() must give the same result at endio time as was used as bio submit time, or things will sleep forever 2008-02-16 15:30 ah ok, decided by the driver 2008-02-16 15:32 yes 2008-02-16 15:32 though in practice all drivers will probably decide the same way 2008-02-16 15:33 so making it a method is just a way of appeasing people who think the bound should be tighter (never mind that it is hopelessly losse at present) 2008-02-16 15:43 "The most important aspect for a kernel interface is simplicity. A complex 2008-02-16 15:43 interface is hard to implement correctly and hard to understand, which means 2008-02-16 15:43 application programmers will introduce bugs when trying to use it. 2008-02-16 15:43 Interestingly, it is much harder to come up with a simple interface than 2008-02-16 15:43 it is to create a complex mess." -- How to not invent kernel interfaces, Arnd Bergmann 2008-02-16 15:44 sounds like a manifesto for ddlink 2008-02-16 23:07 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor 2008-02-16 23:07 willn: python has horrors I don't want to face again any time soon. It doesn't have a stable enough ABI to use for touchy system tasks. 2008-02-16 23:13 C++ has its own little ABI horrors, but if you do a statically linked binary, you can get away from most of them. 2008-02-16 23:14 Statically linked C++ is about 100 times more suitable for touchy stuff in initrd's than python. 
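The throttle loop pasted at 2008-02-16 15:16 subtracts `need` optimistically and adds it back if the result went negative. Below is a minimal userspace sketch of that same subtract-then-undo step using C11 atomics. It is not the kernel patch: `struct throttle` and the `throttle_*` names are hypothetical, and where the kernel code sleeps on `q->throttle_wait` and calls `wake_up()`, this version simply returns `false` so the caller can decide how to wait.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the queue "q" in the log. */
struct throttle {
	atomic_int available; /* resources currently free */
};

static void throttle_init(struct throttle *t, int capacity)
{
	assert(capacity >= 0);
	atomic_init(&t->available, capacity);
}

/*
 * Try to reserve "need" units without blocking.
 * atomic_fetch_sub() returns the OLD value, so (old - need) is what the
 * kernel's atomic_sub_return() would have returned.
 */
static bool throttle_try_acquire(struct throttle *t, int need)
{
	int left = atomic_fetch_sub(&t->available, need) - need;

	if (left >= 0)
		return true;
	/* Another thread got there first: give the units back. */
	atomic_fetch_add(&t->available, need);
	return false;
}

/* Return "need" units; must match the amount that was acquired. */
static void throttle_release(struct throttle *t, int need)
{
	atomic_fetch_add(&t->available, need);
}

static int throttle_available(struct throttle *t)
{
	return atomic_load(&t->available);
}
```

The release amount has to equal the acquire amount, which is the same constraint the log states for `->metric()`: it must give the same result at endio time as at submit time, or the counter never recovers.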
irc.oftc.net #zumastor log beginning Sun Feb 17 00:00:01 PST 2008 2008-02-17 00:07 Zumastor itself isn't doing much more than being a 'daemon', touching files, and spawning other processes 2008-02-17 00:10 I agree that a scripting language isn't suitable for touchy system tasks/running in an initrd 2008-02-17 00:15 Doesn't matter what Zumastor is doing. Scripting languages are fine. The problem is Python's unstable and large library and interpreter. 2008-02-17 00:17 Python is just too big to bundle up in an initrd. And python scripts are notoriously sensitive to the version of python. 2008-02-17 00:18 Perl has similar problems, though not as severe. (And no, I'm not suggesting using it.) 2008-02-17 00:19 The right interpreted language might well be fine for zumastor inside initrd, but I haven't heard of anything better than sh there. 2008-02-17 00:20 But I haven't looked. 2008-02-17 02:38 -!- pgquiles(~pgquiles@130.red-80-39-172.dynamicip.rima-tde.net) has joined #zumastor 2008-02-17 10:26 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-17 15:34 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-17 16:29 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-17 17:19 -!- natalie(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-17 18:16 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-17 19:20 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-17 19:39 Zumastor is an open source Linux project that adds enterprise storage features for NFS and CIFS file serving applications. Based on ddsnap, a filesystem-independent virtual block device, Zumastor supports a broad range of Linux filesystems, providing all of them with multiple volume snapshots and multilevel remote replication. 
Work to date has focussed on functionality and stable operation under heavy load and multiple system failures. Considerable effo 2008-02-17 19:39 rt was invested in improving the Linux block IO subsystem to eliminate deadlock issues that have traditionally affected a number of Linux storage subsystems, including ddsnap. Ongoing work focusses on two main areas: 1) optimizing write throughput to snapshotted volumes and 2) improving flexibility, stability and feature set of Linux volume management to address a range of provisioning capabilities that have emerged recently on other operating systems. --- 2008-02-17 19:39 proposed abstract for fast wip session 2008-02-17 19:39 ACTION pings shapor, dkegel, willn etc 2008-02-17 19:42 Zumastor is an open source project to provide enterprise storage 2008-02-17 19:42 features for Linux NFS and CIFS file serving applications. Based on 2008-02-17 19:42 ddsnap, a filesystem-independent virtual block device, Zumastor supports 2008-02-17 19:42 a broad range of Linux filesystems, providing all of them with multiple 2008-02-17 19:42 volume snapshots, multilevel remote replication and a command level 2008-02-17 19:42 management interface. Work to date has focussed on functionality and 2008-02-17 19:42 stable operation under heavy load and multiple system failures. 2008-02-17 19:42 Considerable effort was invested in improving the Linux block IO 2008-02-17 19:42 subsystem to eliminate deadlock issues that have traditionally affected 2008-02-17 19:42 a number of Linux storage subsystems, including ddsnap. Ongoing work 2008-02-17 19:42 focusses on two main areas: 1) optimizing write throughput to 2008-02-17 19:42 snapshotted volumes and 2) improving flexibility, stability and feature 2008-02-17 19:42 set of Linux volume management to address a range of provisioning 2008-02-17 19:42 capabilities that have emerged recently on other operating systems. 
2008-02-17 19:42 -- slightly improved 2008-02-17 22:06 ddsetup targets 2008-02-17 22:06 ramback v0.0.0 2008-02-17 22:06 crypt v1.5.0 2008-02-17 22:06 striped v1.0.2 2008-02-17 22:06 linear v1.0.2 2008-02-17 22:06 error v1.0.1 2008-02-17 23:54 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor irc.oftc.net #zumastor log beginning Mon Feb 18 00:00:01 PST 2008 2008-02-18 01:04 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-18 02:09 good morning europe 2008-02-18 02:09 holiday monday in usa 2008-02-18 03:29 lo flips 2008-02-18 08:24 hi flips 2008-02-18 08:24 s/command level/command line/ 2008-02-18 08:26 looks pretty good otherwise 2008-02-18 09:41 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-18 10:34 I actually meant command level... 2008-02-18 10:34 well I guess it sounds a little unusual 2008-02-18 10:46 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-18 11:24 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-18 12:43 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-18 12:46 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-18 13:10 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-18 14:00 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor 2008-02-18 14:07 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-18 14:09 Where was Will scraping launchpad.net status for zumastor-team? 2008-02-18 14:10 Found it, https://launchpad.net/~zumastor-team 2008-02-18 14:12 I guess we're not building kernel packages there yet, though. 2008-02-18 14:15 dank: no 2008-02-18 14:15 dank: but it would be a better place than my PPA 2008-02-18 14:16 Yeah. Last week or so I asked Will to build kernel packages like you do. I think he's still thinking about it. 
2008-02-18 14:16 I have a fire to suffocate tomorrow morning but I hope I will be able to work on 2.6.22-14.52-zumastor in the afternoon 2008-02-18 14:17 building packages the way I do (it actually is "the ubuntu way", no my invention) is the only way to get those kernels in the official repositories 2008-02-18 14:17 So I've been telling Will. 2008-02-18 14:20 unless we try to build our own distribution, some ubuntu server 2008-02-18 14:20 + zumastor 2008-02-18 14:20 + virtualization 2008-02-18 14:20 + nice GUI 2008-02-18 14:21 which is more or less what I'm trying to do at work :-) 2008-02-18 14:39 GUI? 2008-02-18 14:45 dank: web based 2008-02-18 14:45 instead of typing pvcreate, lvcreate, zumastor define volume, etc, you do that in a QtParted-like web GUI 2008-02-18 14:46 which happens to be a plugin for a more generic management console (a la Microsoft Management Console) I've developed for internal use 2008-02-18 14:46 still a long way to go but it's nice and easy to use :-) 2008-02-18 14:51 Is it something we can try yet? 2008-02-18 14:51 mmm not really, still too clunky and tied to my test system 2008-02-18 14:52 but I'll tell you as soon as it's usable 2008-02-18 14:54 bedtime 2008-02-18 16:30 -!- crumb(~crumb@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-18 22:01 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor 2008-02-18 22:07 1.1 General License Grant. Microsoft grants to you a personal, non-exclusive, nontransferable, royalty-free license to use the Software, and to make and use five (5) copies of the Software on one or more computers located at your premises solely for the purpose of designing, developing and testing drivers that operate in conjunction with the Software for use with Microsoft Windows 2000 Professional, Microsoft Windows 2000 Server, 2008-02-18 22:07 ACTION wonders wether that makes the ddk examples unusable for wine. 2008-02-18 22:20 Yeah, I'd stay away. Better to look in the public domain... 
2008-02-18 22:20 Was afraid of that. :-/ 2008-02-18 22:21 But it'd be fair game if you were writing a portable driver, I suppose. 2008-02-18 22:30 Well, there are other clauses that prevent distributing source code of your own drivers. 2008-02-18 23:48 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor irc.oftc.net #zumastor log beginning Tue Feb 19 00:00:01 PST 2008 2008-02-19 00:39 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-19 00:59 -!- zumalog(~zumalog@yzf.shapor.com) has joined #zumastor 2008-02-19 01:01 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-19 01:41 ddsnap.c: In function `main': 2008-02-19 01:41 ddsnap.c:2232: error: `PR_SET_NAME' undeclared (first use in this function) 2008-02-19 01:41 ddsnap.c:2232: error: (Each undeclared identifier is reported only once 2008-02-19 01:41 ddsnap.c:2232: error: for each function it appears in.) 2008-02-19 01:41 hm, thats new 2008-02-19 01:41 ddsnap used to compile under debian sarge without errors 2008-02-19 03:53 PR_SET_NAME? 2008-02-19 03:54 hmm 2008-02-19 03:56 my kernel headers are old (2.6.8) 2008-02-19 03:56 pre PR_SET_NAME i guess 2008-02-19 04:00 right 2008-02-19 04:00 that is the problem 2008-02-19 04:00 didn't have the problem before because we weren't using it 2008-02-19 04:01 for basic ddsnap functionality I just run it using files as the underlying devices in user space 2008-02-19 04:01 as a regular user 2008-02-19 04:01 the mlockall() also caused it to fail 2008-02-19 04:01 Operation not permitted 2008-02-19 04:01 caused what to fail? 
2008-02-19 04:01 had to add a #define and change an error() to a warn() to test
2008-02-19 04:02 the server start
2008-02-19 04:02 it errored out
2008-02-19 04:02 my "test as a regular user in debian sarge" changes look like:
2008-02-19 04:02 +#ifndef PR_SET_NAME
2008-02-19 04:02 +#define PR_SET_NAME 15
2008-02-19 04:02 +#endif prctl(PR_SET_NAME, process_name);
2008-02-19 04:03 if (mlockall(MCL_CURRENT|MCL_FUTURE))
2008-02-19 04:03 - error("Unable to lock self into RAM: %s", strerror(errno));
2008-02-19 04:03 + warn("Unable to lock self into RAM: %s", strerror(errno));
2008-02-19 04:03 unable to mlockall because of running in small memory?
2008-02-19 04:04 warn is fine there
2008-02-19 04:04 Mon Feb 18 19:19:26 2008: [7335] start_server: Unable to lock self into RAM: Operation not permitted
2008-02-19 04:04 yeah
2008-02-19 04:04 it should be a warn()
2008-02-19 04:04 shall i commit that part?
2008-02-19 04:05 yes
2008-02-19 04:06 operation not permitted? "EPERM (Linux 2.6.9 and later) the caller was not privileged (CAP_IPC_LOCK) and its RLIMIT_MEMLOCK soft resource limit was 0"
2008-02-19 04:06 so was RLIMIT_MEMLOCK soft resource limit 0?
2008-02-19 04:06 max locked memory (kbytes, -l) unlimited
2008-02-19 04:06 but i'm on 2.6.8
2008-02-19 04:07 pre 2.6.9 only root can mlock iirc
2008-02-19 04:07 you are building on 2.6.8 to run on something current?
2008-02-19 04:07 no i'm building on 2.6.8 to run on 2.6.8
2008-02-19 04:08 only the user space component though
2008-02-19 04:08 why?
2008-02-19 04:08 why not
2008-02-19 04:08 issues like above?
2008-02-19 04:08 because its the kernel which is running on this machine
2008-02-19 04:08 and all the hardware in the laptop works with it :)
2008-02-19 04:09 good enough
2008-02-19 04:09 and running as nonroot, just for fun?
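The two changes discussed above (a fallback `PR_SET_NAME` define for old kernel headers, and downgrading the `mlockall()` failure from fatal to a warning) can be sketched as a standalone C fragment. This is only an illustration of the idea, not the ddsnap.c commit: `lock_memory_or_warn` and `set_process_name` are hypothetical helper names, and `fprintf(stderr, ...)` stands in for ddsnap's own `warn()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/prctl.h>

/* Old headers (e.g. 2.6.8-era) lack this constant; 15 is the value from the log. */
#ifndef PR_SET_NAME
#define PR_SET_NAME 15
#endif

/*
 * Try to pin the process in RAM. On failure, print a warning and return
 * the errno value instead of exiting, so an unprivileged test run can
 * continue. Returns 0 on success.
 */
static int lock_memory_or_warn(void)
{
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
		return 0;

	int err = errno;
	fprintf(stderr, "warning: Unable to lock self into RAM: %s\n", strerror(err));
	return err;
}

/* Set the name shown for this process; returns 0 on success, errno otherwise. */
static int set_process_name(const char *name)
{
	if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) == 0)
		return 0;
	return errno;
}
```

A caller would invoke `set_process_name()` and `lock_memory_or_warn()` once at server start and carry on regardless of the second result. The fallback define only makes the build succeed; a kernel that predates `PR_SET_NAME` is still expected to reject the `prctl()` call at run time.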
2008-02-19 04:09 yes 2008-02-19 04:09 can't say I haven't done it myself 2008-02-19 04:09 i can still create/delete snapshots 2008-02-19 04:10 test usecount/priority stuff 2008-02-19 04:10 right 2008-02-19 04:10 been there 2008-02-19 04:10 you should get one of the fit pcs 2008-02-19 04:10 i was thinking about it 2008-02-19 04:10 for driving my 42" lcd 2008-02-19 04:10 although i dont know if it has the required horsepower 2008-02-19 04:11 what resolution? 2008-02-19 04:11 1366x768 2008-02-19 04:11 should be fine 2008-02-19 04:11 it has no trouble with 1024x768 2008-02-19 04:11 has 2D acceleration 2008-02-19 04:12 well i want to watch dvd's and downloaded mpegs on it 2008-02-19 04:12 it's actually very snappy 2008-02-19 04:12 i do have an external usb dvd drive 2008-02-19 04:12 snappier than my pentium M, which is running older X etc 2008-02-19 04:12 it would be nice to have something small 2008-02-19 04:12 mplayer is smooth 2008-02-19 04:12 play any full-res hd video ? 2008-02-19 04:13 it needs to stream from a cifs/nfs share and decode 2008-02-19 04:13 might start to groan on hd 2008-02-19 04:13 yeah i was thinking that 2008-02-19 04:14 what i should really do is consolidate my server/media pc 2008-02-19 04:14 need a smaller form factor pc though 2008-02-19 04:14 that can accomidate a few ide drives 2008-02-19 04:15 there is a really nice small form factor pc 2008-02-19 04:15 the fit pc is sexy though 2008-02-19 04:15 whose name I have temporarily forgotten 2008-02-19 04:15 yes, it's worth getting just on general principle 2008-02-19 04:15 i could attach it to the back of the display 2008-02-19 04:15 this other one... 
got to jog my memory 2008-02-19 04:15 i saw a another one advertised in LJ i think 2008-02-19 04:15 you can throw it in your backpack when you head to mtv 2008-02-19 04:16 much lighter than a laptop 2008-02-19 04:16 linutop 2008-02-19 04:16 http://www.linutop.com/ 2008-02-19 04:17 looks very similar in design to the fit 2008-02-19 04:19 no hard drive 2008-02-19 04:21 yeah you get more for the money with the fitpc 2008-02-19 04:21 i was playing with openwrt on a spare WRT54GS someone gave me 2008-02-19 04:21 6MB of very slow flash 2008-02-19 04:21 not that much fun 2008-02-19 04:22 nice piece of harware which has a going price of about $20 used 2008-02-19 04:22 aopen minipc 2008-02-19 04:22 or free 2008-02-19 04:23 http://minipc.aopen.com/Global/spec.htm 2008-02-19 04:23 same form factor as mac mini, lots more connectors 2008-02-19 04:23 geez core2 duo 2008-02-19 04:23 thats like a super computer in this house 2008-02-19 04:23 no really expensive either 2008-02-19 04:24 http://www.nextag.com/MP965_-_DR/search-html 2008-02-19 04:24 barebones 2008-02-19 04:24 add processor memory and hard disk 2008-02-19 04:25 includes the slot loading dvd 2008-02-19 04:25 oh writable 2008-02-19 04:27 would be nice if it had 2x hotswap sas drives ;) 2008-02-19 04:27 probably significantly more expensive though 2008-02-19 04:27 this is next on my gadget aquisition list I think 2008-02-19 04:27 all my drives are IDE or SCSI so i'd have to buy new drives anyway 2008-02-19 04:27 flips: no you need an lcd first ;) 2008-02-19 04:27 true 2008-02-19 04:28 http://www.nextag.com/Samsung-LN-T5271F-52-558269665/prices-html 2008-02-19 04:28 though the 81 series with led backlight is tempting... $1K more 2008-02-19 04:29 hm 2008-02-19 04:29 led backlight.. 
nice 2008-02-19 04:30 contrast ration is apparently stunning 2008-02-19 04:30 http://www.nextag.com/Samsung-LN-T5281F-52-558108551/prices-html 2008-02-19 04:30 more than $1k more 2008-02-19 04:32 i've found zumastor really useful at home 2008-02-19 04:32 really! 2008-02-19 04:32 i dont need to have my server on at all times 2008-02-19 04:32 i can turn it off 2008-02-19 04:32 turn it back on, have it replicate 2008-02-19 04:33 hey, time to write a blog 2008-02-19 04:33 with a really low power pc like the fit thing 2008-02-19 04:33 power savings would be huge 2008-02-19 04:34 fit pc is my most favorite pc ever 2008-02-19 04:34 dana's too 2008-02-19 04:35 really need to work on single spindle performance first though 2008-02-19 04:35 before its practical for a really low end machine 2008-02-19 04:35 ok for real... zzz 2008-02-19 04:35 yeah me too 2008-02-19 04:35 5 hours to meeting time 2008-02-19 04:35 woo 2008-02-19 04:36 send my regards ;) 2008-02-19 07:52 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 09:08 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 09:22 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 09:23 -!- Tim_vimm_(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 10:19 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 10:29 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-19 11:13 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-19 12:01 dkegel: you got quoted on cnet 2008-02-19 12:41 Yeah, I saw that. The joys of posting on the Google blog. 2008-02-19 12:43 twice, even. 
http://www.cnet.com/4244-5_1-0.html?query=kegel&tag=srch&target=nw 2008-02-19 14:16 ddsetup now has a majority of dmsetup commands implemented is is only 300 lines long 2008-02-19 14:16 missing quite a few options to be fair 2008-02-19 14:17 and options parsing, maybe this is a good time to write one of those 2008-02-19 19:34 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 19:47 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 21:03 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-19 21:05 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor irc.oftc.net #zumastor log beginning Wed Feb 20 00:00:01 PST 2008 2008-02-20 01:45 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-20 02:47 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-20 07:35 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-20 07:58 -!- aSiDaRBe(~KLBKvTurk@78.183.12.108) has joined #zumastor 2008-02-20 07:58 -!- aSiDaRBe(~KLBKvTurk@78.183.12.108) has left #zumastor 2008-02-20 08:48 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-20 10:05 interesting: http://lkml.org/lkml/2008/2/20/287 2008-02-20 10:05 Add an FS-Cache cache-backend that permits a mounted filesystem to be used as a 2008-02-20 10:05 backing store for the cache. 2008-02-20 12:27 -!- cbsmith(~xman@adsl-71-133-80-65.dsl.irvnca.pacbell.net) has joined #zumastor 2008-02-20 14:34 -!- cbsmith(~xman@64.148.65.12) has joined #zumastor 2008-02-20 14:42 -!- xman(~xman@64.148.65.12) has joined #zumastor 2008-02-20 15:16 jiayingz: ping 2008-02-20 15:28 hi pgquiles 2008-02-20 15:30 jiayingz: hi. I was setting up my servers to test the ioctl issue but reading (instead of just skimming over (-: ) the e-mails it seems to me that you are preparing a patch to deal with the failing udev rule. 
Do you want me to wait for the patch, or do you want me to remove the udev rule and test it now? 2008-02-20 15:32 pgquiles: let me post the zumastor to the public mailing list now. 2008-02-20 15:33 ok 2008-02-20 15:33 it is just a quick hack 2008-02-20 15:33 the real problem is on device mapper code, I think 2008-02-20 15:34 probably 2008-02-20 15:35 re-reading your description of the problem, I see my suggestion of udevtrigger was pointless as the problem was a step before udevtrigger could do any good 2008-02-20 15:35 unfortunately since thursday I've been quite busy preparing a demo for today 2008-02-20 15:36 you already created enough bug reports to keep us busy ;) 2008-02-20 15:37 O:-) 2008-02-20 15:37 udevtrigger seems a new thing added in gusty 2008-02-20 15:37 i am not familar with udev, wonder what udevtrigger does 2008-02-20 15:38 it's a replacement for walk_sysfs 2008-02-20 15:38 or at least that's what http://linuxfromscratch.org/pipermail/lfs-dev/2006-April/056489.html says 2008-02-20 15:38 I had no idea either :-D 2008-02-20 15:38 the patch was posted 2008-02-20 15:39 I'm going to try it 2008-02-20 15:40 udevtrigger has been there since at least edgy 2008-02-20 15:40 http://packages.ubuntu.com/edgy/i386/udev/filelist 2008-02-20 15:44 i c. we are using dapper 2008-02-20 15:44 so i did not find it :-D 2008-02-20 15:45 I cannot talk for dapper, I've not got a dapper installation here and packages.ubuntu.com won't show the list of files in the 'udev' package 2008-02-20 15:46 it might be interesting to test zumastor with hardy, too, after all, it's going to be the next stable and discovering bugs now would be better than after the release 2008-02-20 15:49 that is good suggestion 2008-02-20 15:50 we should test zumastor on all popular debian releases 2008-02-20 15:50 wtf! replication was so fast I though it was not working :-D 2008-02-20 15:50 pgquiles: how did that happen? ;) 2008-02-20 15:52 pgquiles, did you zero the origin device before the replication? 
2008-02-20 15:52 Dan's post to the Google blog made slashdot. 2008-02-20 15:53 yes, zeroed and it only contained a 100 bytes text document 2008-02-20 15:53 I'm dumping data on it now 2008-02-20 16:00 patch is not working properly 2008-02-20 16:00 Thu Feb 21 00:59:35 CET 2008 /bin/zumastor[5827]: dropping snapshot for zumatest(578) 2008-02-20 16:00 device-mapper: remove ioctl failed: Device or resource busy 2008-02-20 16:01 that's in the origin 2008-02-20 16:01 I have many of those 2008-02-20 16:02 in the replica, I have this one (only one) 2008-02-20 16:02 Thu Feb 21 00:52:58 CET 2008 /bin/zumastor[32673]: dropping snapshot for zumatest(460) 2008-02-20 16:02 Unable to unlink device node for 'zumatest(460)' 2008-02-20 16:03 and 'zumastor snapshot zumatest hourly' does not work either while a replication is going on 2008-02-20 16:04 this is what the log says in the origin: 2008-02-20 16:04 Thu Feb 21 01:02:41 CET 2008 /bin/zumastor[5827]: new snapshot will be '658' 2008-02-20 16:04 Thu Feb 21 01:02:41 CET 2008 /bin/zumastor[5827]: error: snapshot zumatest(658) not found 2008-02-20 16:04 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-20 16:04 Thu Feb 21 01:02:41 CET 2008 /bin/zumastor[5827]: dropping snapshot for zumatest(658) 2008-02-20 16:04 Thu Feb 21 01:02:41 2008: [29693] usecount: snapshot server is unable to set usecount for snapshot 658 2008-02-20 16:04 Thu Feb 21 01:02:41 2008: [29693] usecount: server reason for usecount failure: Snapshot tag 658 is not valid 2008-02-20 16:04 Thu Feb 21 01:02:41 CET 2008 /bin/zumastor[5827]: couldn't get usecount for vol 'zumatest' snap '658' 2008-02-20 16:05 oh joy, I think this is new: 2008-02-20 16:05 Thu Feb 21 01:03:59 CET 2008 /bin/zumastor[5827]: new snapshot will be '692' 2008-02-20 16:05 Thu Feb 21 01:03:59 CET 2008 /bin/zumastor[5827]: error: snapshot zumatest(692) not found 2008-02-20 16:05 (and many more like that in the origin) 2008-02-20 16:06 oh, of course: I'm beyong the 64 
snapshots limit 2008-02-20 16:06 so you still got the 'Device or resource busy' error with 'dmsetup remove' 2008-02-20 16:07 and that causes the origin to hit the 64 limit? 2008-02-20 16:07 es 2008-02-20 16:07 yes 2008-02-20 16:07 [01:00] Thu Feb 21 00:59:35 CET 2008 /bin/zumastor[5827]: dropping snapshot for zumatest(578) 2008-02-20 16:07 [01:00] device-mapper: remove ioctl failed: Device or resource busy 2008-02-20 16:08 with the patch, you will still see the message 2008-02-20 16:08 but as long as you don't see the message for five times, the device should be removed 2008-02-20 16:09 oh 2008-02-20 16:09 do you see "remove device failed for zumatest(578)"? 2008-02-20 16:09 yes, but only once 2008-02-20 16:10 that means the patch is working, doesn't it? 2008-02-20 16:11 hmm, so you saw 'device-mapper: remove ioctl failed: Device or resource busy' once and"remove device failed for zumatest(578)" once? 2008-02-20 16:11 that should not happen 2008-02-20 16:12 if remove_device fails, you should see "remove ioctl failed" five times and "remove device failed for zumatest(578)" once 2008-02-20 16:13 this is what I see for example for 426: 2008-02-20 16:13 Thu Feb 21 00:50:49 CET 2008 /bin/zumastor[5827]: new snapshot will be '426' 2008-02-20 16:13 error writing to /var/lib/zumastor/volumes/zumatest/targets/dubna/trigger 2008-02-20 16:13 Thu Feb 21 00:50:49 CET 2008 /bin/zumastor[5827]: dropping snapshot for zumatest(426) 2008-02-20 16:13 device-mapper: remove ioctl failed: Device or resource busy 2008-02-20 16:13 Command failed 2008-02-20 16:13 Thu Feb 21 00:50:49 CET 2008 /bin/zumastor[5827]: remove device failed for zumatest(426) 2008-02-20 16:14 only one 'remove ioctl failed', then 'remove device failed' 2008-02-20 16:14 hmm, let me check the code again 2008-02-20 16:14 I'm replicating with a replication period of 5 seconds but that should allow for 4 failures 2008-02-20 16:21 hmm, it works for me 2008-02-20 16:21 mmm 2008-02-20 16:22 could you add in 
/lib/zumastor/common:remove_device 2008-02-20 16:22 how much data did you have in your volume? 2008-02-20 16:22 echo "remove_device try $i" in the for loop? 2008-02-20 16:22 4G 2008-02-20 16:23 ooooh 2008-02-20 16:23 wait 2008-02-20 16:23 I know what happened 2008-02-20 16:23 zumastor was already running when I applied the patch and I did not reload it 2008-02-20 16:23 my bad 2008-02-20 16:24 i c. you need to run /etc/init.d/zumastor restart 2008-02-20 16:24 done 2008-02-20 16:24 you may want to try shapor's quick fix for slow start/stop before doing that 2008-02-20 16:24 that is fast :) 2008-02-20 16:28 starting a 5G volume is fast anyway 2008-02-20 16:28 snapshots are not working yet 2008-02-20 16:28 it's not dropping snapshots 2008-02-20 16:29 and as I'm beyond 64, it's not taking new ones 2008-02-20 16:29 :-/ 2008-02-20 16:29 you can not take snapshots? 2008-02-20 16:29 exactly 2008-02-20 16:30 so you still got beyond 64 snapshots with the fix? 2008-02-20 16:32 no 2008-02-20 16:32 I got beyond 64 without the fix but the version with the fix is not able to go drop snapshots 2008-02-20 16:34 i c. I think there is some problem when we hit the 64 snapshot limit 2008-02-20 16:35 you can manually remove some snapshots with 'ddsnap delete' 2008-02-20 16:36 but it may be easier to start the test all over again 2008-02-20 16:40 I'm restarting again 2008-02-20 16:43 I think there is some problem in our ddsnap code to pick up a victim 2008-02-20 16:44 when we are at the 64 limit and there is a snapshot create request comes in, ddsnap automatically picks a victim to delete. 2008-02-20 16:45 either zumastor or ddsnap is failing at doing that, or I managed to confuse them enough :-) 2008-02-20 16:45 but when the next create request comes in again, our find_victim code selects the snapshot that we just created for deleting 2008-02-20 16:47 zumastor does not allow any configuration that exceeds the 64 snapshot limit. 
but because device mapper remove does not work, we rely on ddsnap auto delete 2008-02-20 16:51 it'd be nice if zumastor added this: http://www.furquim.org/chironfs 2008-02-20 16:51 automagically, I mean, right after a replica is defined 2008-02-20 16:52 hey jiayingz 2008-02-20 16:52 hi flips 2008-02-20 16:53 pgquiles: I'm not grokking what is different about that from say a clustered filesystem on top of ddraid. 2008-02-20 16:54 pgquiles: Oh I see, it is at the vfs layer. Well, that's kind of anti-thetical to zumastor's approach. 2008-02-20 16:54 flips, how about I post a draft for poster this week. any one can work on it if they want 2008-02-20 16:54 good idea 2008-02-20 16:54 jiayingz, see your query chat window? 2008-02-20 16:55 xman: I think it's perfectly complementary: you get redundancy on a volume by using the replicas as alternate source servers for that volume 2008-02-20 16:55 oh, he left 2008-02-20 16:56 pgquiles, zumastor is supposed to work with any file system 2008-02-20 16:57 a user can use chironfs on top of zumastor 2008-02-20 16:57 but i am not sure if zumastor wants to do that 2008-02-20 16:57 jiayingz: yes but I was trying to avoid having to setup chironfs (or any other DFS clone) :-) 2008-02-20 16:58 jiayingz: why not? 2008-02-20 17:00 pgquiles, that chironfs looks interesting 2008-02-20 17:00 pgquiles: Okay. I'll accept that it is complementary. 2008-02-20 17:00 will need to read more to know more 2008-02-20 17:00 pgquiles: xman was me. I was logged in 2x. 2008-02-20 17:00 xman == cbsmith 2008-02-20 17:00 once just didn't feel like enough 2008-02-20 17:01 I presume, chris == x 2008-02-20 17:01 as in christmas == xmas 2008-02-20 17:02 pgquiles, it is FUSE 2008-02-20 17:03 flips: That's pretty close to how it started. It was "Chris Smith" sounds like "Chrismas", so then I was "Xmas", which reduced to "X" or "Xman". 
2008-02-20 17:03 not sure how its performance looks like 2008-02-20 17:03 jiayingz: All things FUSE *could* be done in kernel space with some effort. 2008-02-20 17:03 something interesting to look at 2008-02-20 17:04 I guess in some ways it is kind of like RAID-Z without the improved write efficiency. 2008-02-20 17:04 having chironfs on top of zumastor means defining the mirrors twice: first you define the replicas in zumastor, then in chironfs 2008-02-20 17:04 that was my poing 2008-02-20 17:04 point 2008-02-20 17:04 hmm, what is wrong with those pictures in its document 2008-02-20 17:07 pgquiles: It'd probably make more sense for the integration to be done on the chironfs side, but I see your point. 2008-02-20 17:08 pgquiles: Actually, you'd need a lot of changes in chironfs, because right now it writes to all the redundant block devices directly from the client when you do a write. 2008-02-20 17:08 fuse is a good way of checking out new filesystem ideas efficiently 2008-02-20 17:08 ACTION still likes the idea of just using a clustered filesystem on network block devices. 2008-02-20 17:08 I would rather work on checking out how to make block devices that do part of the work of the filesystem 2008-02-20 17:09 that's just my personal preference at the moment, not the One True Way[tm] by any means 2008-02-20 17:09 flips: Like basically make managing the namespace be pretty much the only job of the filesystem? 2008-02-20 17:09 pretty much, and security, and layout 2008-02-20 17:09 with more of the layout work maybe being done by the block device 2008-02-20 17:10 security is really managing the layout 2008-02-20 17:10 flips: Even layout could be done at the block device layer. Security would be a bit of an uncomfortable stretch, although perhaps a neat way to do MAC. 
2008-02-20 17:10 err 2008-02-20 17:10 security is really managing the namespace I meant 2008-02-20 17:10 -!- Tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-20 17:10 flips: Or at least that is how it is normally done. 2008-02-20 17:11 true, security model needs to be split across block device and filesystem as well 2008-02-20 17:11 that is a known problem 2008-02-20 17:11 It'd be funky to have access tokens tied to the extent that was allocated for a file. So you literally can't get to the bytes for the file. 2008-02-20 17:11 somewhat in advance of where we are though 2008-02-20 17:11 flips: just a tad 2008-02-20 17:12 the word for it is "object storage device" 2008-02-20 17:12 mmm something failed here. Replication was working fine and once it wrote the last snapshot with data, the replica unmounted the volume 2008-02-20 17:12 though the official protocol for OSD is not necessarily sane 2008-02-20 17:12 pqguiles, that is a known deficiency of 0.5, 0.7 will correct it 2008-02-20 17:13 flips: no, I don't mean old snapshots are lost and only the last one is kept. I mean even the last one is missing. 2008-02-20 17:14 what does zumastor status --list say? 
2008-02-20 17:14 on downstream 2008-02-20 17:15 # zumastor status zumatest --full 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/device 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/device/origin -> /dev/mapper/sysvg-test 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/device/snapstore -> /dev/mapper/sysvg-test_snap 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/filesystem 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source/apply: 640 0/0 0 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source/hold: 640 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source/hostname: spectrum 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source/name: zumatest 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/source/period: 5 2008-02-20 17:15 /var/lib/zumastor/volumes/zumatest/targets 2008-02-20 17:15 when I restarted zumastor, the volume was remounted but was empty, and now it has been unmounted again 2008-02-20 17:16 pgquiles, how about "zumastor status zumatest --list" ? 2008-02-20 17:17 oops I see a lot of "Thu Feb 21 02:14:04 2008: [14895] free_chunk: chunk 3d35d already free!" (with different chunk id's) in server.log downstream 2008-02-20 17:17 that is bad 2008-02-20 17:17 whee! 2008-02-20 17:17 flips: bin/zumastor status: unrecognized option `--list' 2008-02-20 17:17 :-) 2008-02-20 17:18 --list ? 2008-02-20 17:19 --usage you mean ? 2008-02-20 17:19 that sounds a serious bug 2008-02-20 17:20 indeed 2008-02-20 17:20 I'm going to start from scratch to check what happened 2008-02-20 17:20 what happened to status --llist? 2008-02-20 17:20 to list snapshots? 2008-02-20 17:20 yes 2008-02-20 17:20 you're thinking ddsnap status --list 2008-02-20 17:20 not zumastor 2008-02-20 17:20 fsck 2008-02-20 17:21 ddsnap 2008-02-20 17:21 ok, back to the problem at hand... 
2008-02-20 17:22 ok, chunk already free 2008-02-20 17:22 badness 2008-02-20 17:22 check slashdot 2008-02-20 17:22 pgquiles, could you post the log to the mailing list? 2008-02-20 17:23 "Google Funds Work for Photoshop on Linux" 2008-02-20 17:23 *someone* got slashdotted 2008-02-20 17:23 dkegel, you are slashdotted 2008-02-20 17:24 there's an "i'd hit it" joke in there somehwhere :-P 2008-02-20 17:24 flips: sure. What logs do you want? everything upstream and downstream? 2008-02-20 17:25 just downstream should do 2008-02-20 17:25 we need to work back to the source of the "already free" 2008-02-20 17:26 shapor, we have more than one log, right? 2008-02-20 17:28 server.log, source.log and delta.log 2008-02-20 17:28 sent 2008-02-20 17:28 I sent everything from downstream 2008-02-20 17:29 See when I mention Dan gets on slashdot, noone notices. :-) 2008-02-20 17:29 shall we speculate about what it might be? 2008-02-20 17:29 ACTION will speculate after he picks up his bike. :-) 2008-02-20 17:29 bbiab 2008-02-20 17:30 happened on restart => possibiliity that it was a journal replay issue. However. Journal should not have needed to be replayed. 2008-02-20 17:30 flips: no, it happened before restart 2008-02-20 17:30 you restarted because something funny was happening? 
2008-02-20 17:30 flips: actually I did restart to see if restarting would remount it 2008-02-20 17:31 flips: not exactly funny :-) 2008-02-20 17:31 these were the steps I performed when testing: 2008-02-20 17:31 we need to see the logs 2008-02-20 17:31 see if the snapshot store filled up for example 2008-02-20 17:32 I see your emailk 2008-02-20 17:32 ok 2008-02-20 17:32 email 2008-02-20 17:33 replication was working fine and I was seeing the data but after the last replication which transferred data downstream finished, the volume has been unmounted downstream 2008-02-20 17:33 Thu Feb 21 02:14:04 2008: [14895] delete_snap: Delete snaptag 609 (snapnum 1) 2008-02-20 17:33 Thu Feb 21 02:14:04 2008: [14895] free_chunk: chunk 3e337 already free! 2008-02-20 17:33 then I restarted and the volume was mounted for a moment, but got unmounted immediately 2008-02-20 17:35 geez, dkegel made it all the way to the cgsociety forums 2008-02-20 17:36 i spend too much time on teh internets 2008-02-20 17:38 flips, i saw journal_replay in server.log 2008-02-20 17:38 after that, free_chunk messages appear 2008-02-20 17:42 the --zero patch seems to work, getting nice speeds doing the sequential zero during define 2008-02-20 17:42 time to check in my journal replay patch 2008-02-20 17:42 tomorrow 2008-02-20 17:42 willn, groovy 2008-02-20 17:44 need a background loop to provide some sort of a progress bar 2008-02-20 17:44 i'm kill -USR1 every once in a while for my own benefit 2008-02-20 17:44 I'm not crazy about hacking something like that into bin/zumastor though. 2008-02-20 17:45 willn: why not? 2008-02-20 17:45 It's ugly. Really ugly. 2008-02-20 17:45 willn: kill -USR1 is what dd tells you to do to see progress 2008-02-20 17:45 pgquiles, after a replication cycle finishes, zumastor umounts the old snapshot and mounts the new snapshot on downstream. When you saw the snapshot was not mounted, that may be because mount took some time to finish 2008-02-20 17:45 pgquiles: Yea. 
2008-02-20 17:46 Thu Feb 21 02:13:55 2008: [14895] daemonize: starting at Thu Feb 21 02:13:55 2008 2008-02-20 17:46 Thu Feb 21 02:13:56 2008: [14895] event_parse_options: invalid count in DDSNAP_COUNT 2008-02-20 17:46 Thu Feb 21 02:13:56 2008: [14895] snap_server: Received connection 2008-02-20 17:46 Thu Feb 21 02:13:56 2008: [14895] incoming: Activating server 2008-02-20 17:46 Thu Feb 21 02:13:56 2008: [14895] incoming: Server was not shut down properly 2008-02-20 17:46 Thu Feb 21 02:13:56 2008: [14895] replay_journal: Replaying journal 2008-02-20 17:47 pgquiles, how large are your /etc/blk.tab and /etc/blk.tab.bak? 2008-02-20 17:47 if they are quite large, mount takes a while to finish 2008-02-20 17:48 jiayingz: three entries 2008-02-20 17:48 jiayingz: I stopped zumastor a few minutes ago and started it now and now the volume is mounted downstream 2008-02-20 17:48 so you are probably right but even then, something is/was wrong 2008-02-20 17:49 because several minutes had elapsed since the delta was transferred and the snapshot had not been mounted 2008-02-20 17:49 but now it's taken only a second to mount it 2008-02-20 17:51 tomorrow I will apply shapor's fix to quickly mount/umount and repeat the test 2008-02-20 17:51 now I have a philophosical doubt: it's almost 3AM and I'm still at the office. Should I go home and sleep or stay here? heh... 2008-02-20 17:51 i saw the problem 2008-02-20 17:51 there is definitely a problem there somewhere 2008-02-20 17:52 zumastor recovered, good, but we need to track down the issue 2008-02-20 17:52 looks like the replication cycle of snapshot 640 did not finish 2008-02-20 17:52 I see we can do a much better job of logging server events 2008-02-20 17:52 there are a lot of "client connected"s without saying what the client did 2008-02-20 17:53 pqguiles, you should go home and sleep, no question 2008-02-20 17:53 logging the commands as they are executed would also be helpful, i. e. 
"zumastor snapshot hourly", "zumastor start source", etc 2008-02-20 17:53 yes 2008-02-20 17:53 the log is a bit of a mess 2008-02-20 17:53 easily cleaned up 2008-02-20 17:53 we should interleave all the lotgs 2008-02-20 17:53 logs 2008-02-20 17:54 not have lots of separate ones 2008-02-20 17:54 ok, see you tomorrow 2008-02-20 17:54 bye 2008-02-20 17:54 thank you for your patience :-) 2008-02-20 17:54 thanks again 2008-02-20 17:55 for your work 2008-02-20 17:55 thanks a lot for your tests and bug reports! 2008-02-20 17:55 shapor, ping? 2008-02-20 17:57 by the way, perhaps we should expose ddsnap status --list as a zumastor command 2008-02-20 17:58 it would be nice to have a way of saying "give me nothing but the snapshot numbers" 2008-02-20 17:58 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-20 17:58 the think is, zumastor takes care of the details of knowning the server socket for the command 2008-02-20 17:59 is there any shell command to translate signal numbers to signal names/text? 2008-02-20 18:01 flips: You know, that's a good question. You'd think strsignal would have a cli wrapper somewhere. 2008-02-20 18:02 flips: kill -l 2008-02-20 18:03 ACTION does "alias strsignal='kill -l'" 2008-02-20 18:03 nice, now how about translate signal name to descriptive text? 2008-02-20 18:03 flips: What about the name isn't descriptive? ;-) 2008-02-20 18:04 flips: so, the best I have for that is "man 7 signal" 2008-02-20 18:04 kill -L looks like it should work 2008-02-20 18:04 but it doesn't 2008-02-20 18:05 flips: Try /bin/kill -L 2008-02-20 18:05 it works but is lame 2008-02-20 18:05 It turns out it doesn't help much though. 
2008-02-20 18:05 no description 2008-02-20 18:05 well 2008-02-20 18:05 man 7 signal | grep is probably the best one can hope for 2008-02-20 18:05 somebody can add a new option to kill then 2008-02-20 18:05 seems to be the way to do it 2008-02-20 18:06 kill -L 2008-02-20 18:06 flips: kill isn't as useful a place to put it as you'd think, because shells use their own built in kills most of the time. 2008-02-20 18:06 gives a long listing of the signal including a description at a parsable position? 2008-02-20 18:06 /bin/kill works for me 2008-02-20 18:07 anyway, status quo is lame 2008-02-20 18:07 Yeah, but then you have to type in /bin/... that's like 5 more characters! 2008-02-20 18:07 it's about having the script be able to write descriptions into the logs instead of signal numbers 2008-02-20 18:08 flips: ah 2008-02-20 18:09 flips: The output of `kill -l ` is still better than number. 2008-02-20 18:09 a bash patch to invoke the real command when it is too stupid to know the options would be nice 2008-02-20 18:09 flips: yes *that* ought to be done 2008-02-20 18:09 yes to "better than number" 2008-02-20 18:10 want to put in the issue? 2008-02-20 18:10 Sure. Why not. 2008-02-20 18:10 I think we can always count on kill being in /bin 2008-02-20 18:10 flips: yes, you can 2008-02-20 18:11 flips: I believe it is in lsb 2008-02-20 18:11 flips: And it is there on Gentoo. If it wouldn't be there on any platform, it'd be Gentoo. 2008-02-20 18:12 flips: Where are we logging the signal number? 2008-02-20 18:12 probably 2008-02-20 18:13 I fgrep'd "signal" in zumastor and didn't see it. 2008-02-20 18:14 or are we doing this logging in ddsnap? 2008-02-20 18:15 'cause I see lots of stuff in ddsnap where we log the signal number, in which case we should just use strsignal instead of spawning a process to get a string. 2008-02-20 18:21 Oy. Building our packages for amd64 is a pita. 2008-02-20 18:21 [With our build scripts] 2008-02-20 18:21 willn: I do it all the time. 
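The signal discussion above asks for a way to turn a signal number into both a name and descriptive text for log lines, and ends by suggesting strsignal over spawning a process. A minimal sketch of that idea follows, written in Python for brevity rather than in the C that ddsnap uses. The helper name `describe_signal` is hypothetical and not part of zumastor or ddsnap; `signal.strsignal` needs Python 3.8 or later and returns whatever description the platform's C library provides.

```python
import signal


def describe_signal(signum):
    """Return 'NAME (description)' for a signal number, suitable for a log line.

    Hypothetical helper for illustration only; not taken from zumastor/ddsnap.
    """
    try:
        # Signals(...) maps a number to its symbolic name, like `kill -l N`.
        name = signal.Signals(signum).name
    except ValueError:
        return f"signal {signum} (unknown)"
    # strsignal() wraps the C library's strsignal(3); the text is platform-dependent.
    text = signal.strsignal(signum)
    return f"{name} ({text})" if text else name


if __name__ == "__main__":
    for number in (signal.SIGUSR1, signal.SIGTERM, signal.SIGKILL):
        print(int(number), describe_signal(number))
```

Because the lookup happens in-process, it avoids the shell-builtin versus /bin/kill difference mentioned in the discussion.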
2008-02-20 18:21 willn: Oh... not with the build scripts 2008-02-20 18:21 ACTION uses the almighty make 2008-02-20 18:21 debs? 2008-02-20 18:22 I have observed though that we have reintroduced some more 64-bit portability bugs that I was going to clean up. 2008-02-20 18:22 I just built our kernel, but I did the build on a hardy box 2008-02-20 18:22 so it does not function on gutsy 2008-02-20 18:23 (due to libc version differences) 2008-02-20 18:23 Ah. I always build my kernels manually. 2008-02-20 18:23 Our deb build process currently sucks, so i'm trying to track that down 2008-02-20 18:26 willn: Yes. I've been unable to determine how much of that ugliness is just deb being deb, and how much is our own doing. 2008-02-20 18:26 Its about 3/4 us, 1/4 them, afaict 2008-02-20 18:26 jiaying asks "what size should a poster be?" 2008-02-20 18:26 I suppose the answer is "poster sized" 2008-02-20 18:26 now who has a poster 2008-02-20 18:27 18x24 is the one I remember. 2008-02-20 18:27 I think there are actually multiple post dimensions 2008-02-20 18:27 s/post/poster/ 2008-02-20 18:27 ACTION checks 2008-02-20 18:28 standard poster dimensions used like 18 x 24, 24 x 36, 36 x 48, 48 x 72 and 48 x 120 2008-02-20 18:28 random answer from google 2008-02-20 18:29 http://answers.yahoo.com/question/index?qid=20070627052048AAjk14n <= Standard Poster Dimensions? 2008-02-20 18:30 If it's on Yahoo Answers it must be true. 2008-02-20 18:30 tim_vimm, who has a big printer to print a full size poster? 2008-02-20 18:30 Yeah, I always thought of 11x17 as tabloid dimensions. 2008-02-20 18:31 eensy weensy 2008-02-20 18:31 uh, there's one I used on santa monica east of the 405 2008-02-20 18:31 let me poke around 2008-02-20 18:31 It's hard to find printers that can handle anything bigger than 11x17 2008-02-20 18:31 Unless you mean "printers" the business instead of "printers" the machine. 
2008-02-20 18:31 http://www.poster-printer.com/ 2008-02-20 18:32 http://www.digitalroom.com/ 2008-02-20 18:32 http://maps.google.com/maps?q=poster+printer&ie=UTF-8&oe=utf-8&rls=org.debian:en-US:unofficial&client=iceweasel-a&um=1&sa=N&tab=wl 2008-02-20 18:32 Colby Poster Printing Co 2008-02-20 18:33 too far away 2008-02-20 18:34 and they don't actually say they have a printer 2008-02-20 18:34 probably should just be looking for "instant printer" 2008-02-20 18:34 we've got a poster printer in the office 2008-02-20 18:34 kinkos does a good job actually 2008-02-20 18:34 oh good 2008-02-20 18:34 well that solves the problem 2008-02-20 18:34 willn, so what size does it print? 2008-02-20 18:34 uh 2008-02-20 18:34 hangon 2008-02-20 18:39 C size 2008-02-20 18:39 so kinkos it is... 2008-02-20 18:39 24x36 is pretty standard 2008-02-20 18:40 36x48 is huge 2008-02-20 18:41 ACTION is talking up Zumastor on OCLUG 2008-02-20 18:43 24x36 sounds like the one 2008-02-20 18:44 google maps is lame... I type in "poster printing" then hit "maps" and it shows me the entire us 2008-02-20 18:44 in spite of the fact that I has set my home location in santa monica 2008-02-20 18:44 so... I wonder if we can do something about that ;) 2008-02-20 18:58 ok 2008-02-20 18:58 box is online 2008-02-20 18:58 18T volume, 1T snapstore 2008-02-20 18:59 i'd say we should replicate it somewhere, but.... 2008-02-20 19:08 18T :) 2008-02-20 19:08 sun box? 2008-02-20 19:08 sounds like 24T raw 2008-02-20 19:08 48 spindles 2008-02-20 19:08 RAID 5 ? 2008-02-20 19:09 sun doing something useful for us, w00t 2008-02-20 19:09 shapor: Hopefully RAID-6 or some combination of RAID-0 and RAID-5. I can't imagine what it'd be like to only need 2 in 48 drives fail to lose everything. 2008-02-20 19:09 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-20 19:10 could be configured as 6 raid0 sets of 6+2 raid 6 2008-02-20 19:10 did that make sense? 
2008-02-20 19:10 ACTION tries to parse 2008-02-20 19:11 here's a lunar eclipse going on now 2008-02-20 19:11 yes 2008-02-20 19:11 ACTION writes it differently 2008-02-20 19:11 6X(6+2 rad6) 2008-02-20 19:11 ah right 2008-02-20 19:11 need some darkness 2008-02-20 19:11 flips: okay, I parsed it. 2008-02-20 19:11 flips: the second one was easier, even with "RAD" :-) 2008-02-20 19:12 cbsmith: its 2xraid 6 + 2 hotspares shared 2008-02-20 19:12 rad6 is new :) 2008-02-20 19:12 flips: I suspect 7+1 RAID 5 striped 6 ways would be safe enough. 2008-02-20 19:12 way until I reveal raid 2+2, it is awesome 2008-02-20 19:12 flips: hehe 2008-02-20 19:13 cbsmith, I don't agree it is safe 2008-02-20 19:13 it is either unsafe or excessively redundant 2008-02-20 19:13 OCLUG had a "interesting" thread on RAID today, with various wacky RAID concoctions being discussed. 2008-02-20 19:14 flips: Okay, then I'd say the RAID6 + RAID-0 config you had was perhaps excessively redundant. You'd actually have a good chance of the system still being up with 4 simultaneous drive failures. 2008-02-20 19:14 what is redundant about raid0? 2008-02-20 19:15 As our configuration stands, we could lose 5 drives before data failure 2008-02-20 19:15 flips: You mean RAID-0 + 1 kind of config? That's pretty overly redundant too, but zippy to say the least. 2008-02-20 19:15 willn: What is the config? 2008-02-20 19:15 8 drives if they all failed distributed across 2008-02-20 19:15 2 groups of 23 drives in raid 6, plus 2 hot spares 2008-02-20 19:15 cmbsmith, no, I meant raid6 * raid0 2008-02-20 19:16 I didn't leave any room for hot spares 2008-02-20 19:16 probalby should 2008-02-20 19:16 Yeah. 2008-02-20 19:16 willn: the two RAID's are striped together? 2008-02-20 19:16 If you don't have hot spares your not playing the game 2008-02-20 19:16 cbsmith: lvm as icing on top 2008-02-20 19:16 flips: I wasn't thinking of hot spares either. No need. Takes away the fun. 
;-) 2008-02-20 19:16 willn: Using lvm to do the striping? 2008-02-20 19:16 willn: funky 2008-02-20 19:17 (5 * raid6(7+2) + 2 * spare) 2008-02-20 19:17 whoops 2008-02-20 19:17 (5 * raid6(7+2) + 3 * spare) 2008-02-20 19:17 gross number of spares 2008-02-20 19:17 I like willn's config better 2008-02-20 19:17 well 2008-02-20 19:17 hot spares when you have raid 6 is kind of silly 2008-02-20 19:17 otherwise we'd do lvm(raid0(raid6(23 drives), raid6(23 drives))) 2008-02-20 19:17 flips: no. 2008-02-20 19:17 example? 2008-02-20 19:18 If your aiming to be enterprisey, you'd better err on the side of caution 2008-02-20 19:18 willn: Wait, that's what I thought you did. 2008-02-20 19:18 raid6 is erring on the side of caution 2008-02-20 19:18 erring on the side of expensive is also bad 2008-02-20 19:18 flips: I've seen certain enterprise storage products using raid6-work-alikes have multiple drive failures and end up tapping spares to remain hot 2008-02-20 19:18 flips: raid6 with 23 drives isn't *that* cautious. Keep in mind that during rebuilds you have this nasty tendency to cause other drives to fail. 2008-02-20 19:19 flips: expensive? hardly. 2008-02-20 19:19 cbsmith, I didn't say raid6 with 23 drives 2008-02-20 19:19 19T usable out of 24 theoretical is pretty darn good 2008-02-20 19:19 I said (6 * raid6(6+2)) 2008-02-20 19:19 yes 2008-02-20 19:20 but it is raid5 or what? 2008-02-20 19:20 what? 2008-02-20 19:20 what is the configuration? 2008-02-20 19:20 lvm(raid6(23 drives), raid6(23 drives)) 2008-02-20 19:20 sorry, missed that 2008-02-20 19:20 theres a shared 2 drive spares pool 2008-02-20 19:21 yes, nice 2008-02-20 19:21 very nice 2008-02-20 19:22 ok, so the hot spares mean that the box can be brought back to full performance remotely 2008-02-20 19:22 is that the entire benefit? 
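The RAID layouts traded back and forth above can be compared with a little arithmetic. The sketch below counts data drives for each layout mentioned in the discussion, all of which use 48 drives. The per-drive size of 0.5 TB is an assumption derived from "24T raw" across "48 spindles"; it is not stated directly in the log, and the function and variable names are illustrative only.

```python
DRIVE_TB = 0.5   # assumption: 24 TB raw / 48 spindles, decimal terabytes
TIB = 2 ** 40    # bytes per binary terabyte (TiB)


def data_drives(groups, drives_per_group, parity_per_group):
    """Drives left for data after parity, across all RAID groups."""
    return groups * (drives_per_group - parity_per_group)


# name -> (groups, drives per group, parity drives per group, hot spares)
layouts = {
    "2 x raid6(23) + 2 spares": (2, 23, 2, 2),
    "6 x raid6(6+2)": (6, 8, 2, 0),
    "5 x raid6(7+2) + 3 spares": (5, 9, 2, 3),
    "6 x raid5(7+1)": (6, 8, 1, 0),
}

summary = {}
for name, (groups, per_group, parity, spares) in layouts.items():
    total = groups * per_group + spares
    usable_tb = data_drives(groups, per_group, parity) * DRIVE_TB
    usable_tib = usable_tb * 1e12 / TIB
    summary[name] = (total, usable_tb, round(usable_tib, 1))
    print(f"{name}: {total} drives, {usable_tb} TB usable (~{usable_tib:.1f} TiB)")
```

Under that drive-size assumption, the two 23-drive RAID 6 groups give 21 TB decimal, or roughly 19 in binary units, which is in line with the "19T usable out of 24 theoretical" figure quoted above.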
2008-02-20 19:23 if the array degrades, the hot spare will join the hot array pool and keep functioning in a non-degraded state until you replace the broken drive 2008-02-20 19:23 otherwise you risk running degraded for a while 2008-02-20 19:23 what I said indeed 2008-02-20 19:24 its a big benefit. 2008-02-20 19:24 oh yes 2008-02-20 19:24 it would be nice to see cpu number 2008-02-20 19:24 s 2008-02-20 19:24 also helps when your hardware techs accidentally 'replace' the wrong drive 2008-02-20 19:24 cpu numbers 2008-02-20 19:24 for degraded mode 2008-02-20 19:24 heh 2008-02-20 19:25 its ok, the sata_mv driver does not support hot swap, iirc 2008-02-20 19:25 or status checking 2008-02-20 19:26 is there a software way to take one disk offline and see how much cpu the reconstruction requires? 2008-02-20 19:26 Yea, we can do that later. 2008-02-20 19:26 right now we're still rebuilding 2008-02-20 19:27 the box seems happier with this driver, not complaining about soft lockups or getting horribly overloaded (yet) 2008-02-20 19:27 but we are still missing features 2008-02-20 19:31 lunar ecplise is pretty spectacular 2008-02-20 19:32 particularly with a little magnification 2008-02-20 19:32 which drive is it? 2008-02-20 19:32 which driver is it? 2008-02-20 19:48 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-20 19:51 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-20 19:52 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has left #zumastor 2008-02-20 20:00 flips: yeah, looking at it through a 300mm lens 2008-02-20 20:00 leet 2008-02-20 20:00 the last one i got to look through a pretty nice telescope 2008-02-20 20:00 it more than filled the field of view 2008-02-20 20:00 that was many years ago 2008-02-20 20:03 jiayingz, what is the subject line of your patch with the new error handler in it? 
2008-02-20 20:03 or shapor 2008-02-20 20:04 damm, I have failed to review way too many patches 2008-02-20 20:07 you should flag patch messages in your mua 2008-02-20 20:09 flips: "[patch] print error messages from ddsnapd in ddsnap" 2008-02-20 20:09 i repsonded to that with a patch back on 12/17 2008-02-20 20:09 you did ack it 2008-02-20 20:09 but it had a bad bug in it 2008-02-20 20:10 which i knew when i posted it ;) 2008-02-20 20:11 http://groups.google.com/group/zumastor/browse_thread/thread/75e42fe0f1408416/5914d1fca082b70b?#5914d1fca082b70b 2008-02-20 20:14 flips: sata_mv instead of the marvell mv_sata 2008-02-20 20:19 er 2008-02-20 20:19 "XFS: Filesystem dm-3 has duplicate UUID - can't mount" 2008-02-20 20:19 rut ro 2008-02-20 20:21 uuid braindamage 2008-02-20 20:21 how to just say "don't do that sh*t" 2008-02-20 20:21 moreover 2008-02-20 20:21 how do we fix it 2008-02-20 20:21 so zumastor isnt broken 2008-02-20 20:21 apt-get remove something? 2008-02-20 20:22 uh 2008-02-20 20:22 well 2008-02-20 20:22 the full explanation is, you describe to me in detail what failed and I make ddsetup not do that 2008-02-20 20:22 after zumastor snapshot hourly 2008-02-20 20:22 I am looking with very squinty eyes at the entire notion of uuid in dm 2008-02-20 20:22 it fails to mount the snapshot volume 2008-02-20 20:22 I see 2008-02-20 20:22 willn: we documented this problem 2008-02-20 20:22 because of a duplicate uuid 2008-02-20 20:22 shapor: where? 
2008-02-20 20:23 you have to pass nouuid option 2008-02-20 20:23 its not mentioned in the howto 2008-02-20 20:23 its the reason we added mount options to zumastor 2008-02-20 20:23 willn: somewhere :) 2008-02-20 20:23 well, its useless if we can't find it 2008-02-20 20:23 uuid support is really msconceived in dm 2008-02-20 20:23 where we is not the author 2008-02-20 20:23 I have trouble believing it's any better conceived in udev etc 2008-02-20 20:24 willn, I think "just grit your teeth and do it", and I make ddsetup default to having no uuid braindamage 2008-02-20 20:25 so you can enable it via --uuiddamage or something 2008-02-20 20:25 heh 2008-02-20 20:25 I think uuids are a net plus for a simple setup, but become troublesome in whack environments like ours 2008-02-20 20:26 there is just no fscking way uuidds should be implemented in kernel as dm does 2008-02-20 20:26 The 'random scattered config files' thing is an issue. Do we document a way to modify volume settings? 2008-02-20 20:26 [eg, adding this nouuid option] 2008-02-20 20:26 [besides echo nouuid > /var/long/stupid/path ] 2008-02-20 20:27 we already have a --mountopts 2008-02-20 20:27 never touch the filesystem database directly 2008-02-20 20:27 shapor: If you have already defined your volume, now what? 2008-02-20 20:28 redefine is supposed to work 2008-02-20 20:28 i think someone broke that at some point though ;) 2008-02-20 20:28 -_- 2008-02-20 20:30 wow it was over a year ago we discussed the nouuid option for zumastor 2008-02-20 20:30 willn: see the xfs cbtb test 2008-02-20 20:30 if you want an example 2008-02-20 20:30 are you making it one big filesystem? 2008-02-20 20:31 Right now, yea 2008-02-20 20:31 heh 2008-02-20 20:35 shapor, do you remember where jiaying's error handling patch is? 
2008-02-20 20:37 ok, it's "Re: two patches to be reviewed for supporting origin/snapshot resizing" 2008-02-20 20:38 and I already grabbed it before, shows how stuff can fall out of cache 2008-02-20 20:41 hm i just read about saturn being visible next to the eclipse 2008-02-20 20:41 wish i had a telescope to see the rings 2008-02-20 20:42 I wish I had a spaceship to go up there 2008-02-20 20:42 you dont? 2008-02-20 20:42 i'd let you borrow mine but its in the shop 2008-02-20 20:43 got towed last night 2008-02-20 20:43 damn spacecops 2008-02-20 20:56 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor irc.oftc.net #zumastor log beginning Thu Feb 21 00:00:01 PST 2008 2008-02-21 00:54 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-21 00:54 -!- willn(~wan@pinball.ccs.neu.edu) has joined #zumastor 2008-02-21 00:55 -!- ChanServ changed topic to "http://www.zumastor.org" 2008-02-21 01:48 apt-get install lvm2 just hangs in install scripts on my system here, then can't be isntalled or removed by any obvious means, breaking apt-get 2008-02-21 01:49 -!- erwan_taf(~erwan@LAubervilliers-151-13-63-69.w217-128.abo.wanadoo.fr) has joined #zumastor 2008-02-21 01:49 likely, butchery on the state file is required to get my system back 2008-02-21 01:49 (attempting to back out an lvm2 install) 2008-02-21 01:51 ok, dpkg --remove lvm-common fixed it 2008-02-21 01:52 what distro? 2008-02-21 01:52 debian sid 2008-02-21 01:54 http://www.eeeuser.com/2008/01/21/further-details-on-the-eee-touchscreen/ 9 inch eee with touchscreen 2008-02-21 02:43 willn: what distribution are you installing in the sun with the 19T volume? 
if it's ubuntu > dapper, bug 107326 will hit you: https://bugs.edge.launchpad.net/ubuntu/+source/parted/+bug/107326 2008-02-21 02:58 its gutsy iirc 2008-02-21 02:59 if its only an installer problem it might not effect him 2008-02-21 02:59 he installed the os on a 4gb cf card 2008-02-21 03:06 shapor: the problem is the partition will be trimmed down to 2GB and all data lost on reboot 2008-02-21 03:07 shapor: it's not just an installer problem, it's a "fix" ubuntu added in feisty and breaks partitions on everything but macintosh 2008-02-21 03:07 s/breaks/destroys 2008-02-21 07:17 pgquiles: we've not hit that bug 2008-02-21 07:20 willn: ok 2008-02-21 07:32 /dev/mapper/hugevol 18T 1.1M 18T 1% /var/run/zumastor/mount/hugevol 2008-02-21 07:32 Rock. 2008-02-21 07:38 Hmm 2008-02-21 07:38 it seems to have wedged itself on writes 2008-02-21 07:38 wrote ~1.5G via ram-cache 2008-02-21 07:38 now its not writing anything 2008-02-21 07:41 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-21 07:41 Yea, its basically useless. 2008-02-21 07:41 I assume there is some ddsnap issue with a volume of this size 2008-02-21 07:43 willn: I ran into problems with a 3TB volume a month or so ago 2008-02-21 07:43 ACTION hmms 2008-02-21 07:43 zumastor forget is wedged too 2008-02-21 07:43 there are some race conditions with ddsnap and zumastor yet 2008-02-21 07:45 testing large volumes is not worth the effort until you will be able to replicate a 5G volume with a replication period of 1 second 2008-02-21 07:45 5GB of data, I mean, not an empty volume 2008-02-21 07:46 but currently it fails to replicate 5G with 5 seconds period, so... 2008-02-21 07:46 bleh 2008-02-21 07:47 the math here is large volume with sensible replication period == small volume with crazy fast replication period 2008-02-21 08:01 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-21 08:06 this 2T vol seems ok 2008-02-21 08:07 replicating ok, you mean? 
2008-02-21 08:07 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-21 08:08 no, running at all 2008-02-21 08:08 though, snapshots are horrid with xfs, due to the xfs_freeze process 2008-02-21 08:19 writing+snapshotting on xfs results in horkage 2008-02-21 09:15 willn: IIRC the 2TB destruction happened on rebooting, not on umounting/mounting, due to partition label rewrite 2008-02-21 09:16 have you rebooted the Sun machine? 2008-02-21 09:16 willn: if its wedged, get sysrq-t output 2008-02-21 09:59 pgquiles: yea 2008-02-21 09:59 shapor: Its wedged multiple times 2008-02-21 09:59 I killed ddnsapd last time with limited success 2008-02-21 10:02 open("/var/lib/zumastor/volumes/twoterabyte/master/trigger", O_WRONLY|O_CREAT|O_TRUNC, 0666 2008-02-21 10:02 ACTION waits 2008-02-21 10:05 which process is doing that? 2008-02-21 10:06 more importantly, what is the master doing ;) 2008-02-21 10:07 thats `zumastor snapshot ....` 2008-02-21 10:08 the master is doing a bunch of 2008-02-21 10:08 read(9, "\4\0\255\276\20\0\0\0", 8) = 8 2008-02-21 10:08 read(9, "s\0228\0\1\0\325\315@\2\0\0\0\0\1\0", 16) = 16 2008-02-21 10:08 write(9, "\5\0\255\276\20\0\0\0s\0228\0\1\0\325\315@\2\0\0\0\0\1"..., 24) = 24 2008-02-21 10:08 poll([{fd=5, events=POLLIN}, {fd=7, events=POLLIN}, {fd=6, events=POLLIN}, {fd=9, events=POLLIN, revents=POLLIN}], 2008-02-21 10:08 and 2008-02-21 10:08 pread(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 619338629120) = 16384 2008-02-21 10:08 pwrite(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 16384, 923582464) = 16384 2008-02-21 10:10 no "zumastor master" not "ddsnap server" 2008-02-21 10:10 mm 2008-02-21 10:11 it seems to be dead 2008-02-21 10:13 can you get sysrq-t output from the console? 2008-02-21 10:15 were you stracing ddsnap server when it locked up? 
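The strace lines pasted above show an 8-byte read followed by a 16-byte read on fd 9. A small sketch for turning strace's octal-escaped strings back into bytes makes the pattern easier to see. The helper `strace_bytes` is hypothetical, and reading the 8 bytes as two little-endian 32-bit words is an assumption about the wire format, not something the log confirms; the only thing checked here is that the second word equals the size of the read that follows.

```python
import re
import struct

# Matches an octal escape (\4, \20, \255), a simple escape (\n, \t), or one literal char.
_TOKEN = re.compile(r'\\([0-7]{1,3})|\\(.)|(.)', re.DOTALL)
_SIMPLE = {'n': 10, 't': 9, 'r': 13, 'v': 11, 'f': 12, '\\': 92, '"': 34}


def strace_bytes(text):
    """Decode the quoted string strace prints for a read()/write() buffer."""
    out = bytearray()
    for octal, simple, literal in _TOKEN.findall(text):
        if octal:
            out.append(int(octal, 8))
        elif simple:
            out.append(_SIMPLE.get(simple, ord(simple)))
        else:
            out.append(ord(literal))
    return bytes(out)


# Buffers copied from the strace output in the log (raw strings keep the backslashes).
header = strace_bytes(r"\4\0\255\276\20\0\0\0")
body = strace_bytes(r"s\0228\0\1\0\325\315@\2\0\0\0\0\1\0")

# Assumed layout: two little-endian unsigned 32-bit words.
word0, length = struct.unpack("<II", header)
print(f"header: word0=0x{word0:08X} length={length}, body is {len(body)} bytes")
```

The second word decodes to 16, which matches the 16-byte read that follows and the 24-byte write (8 + 16) after it, so the second field plausibly carries the body length.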
2008-02-21 10:15 no, i'm booting the machine, its a really reproducible issue 2008-02-21 10:15 thats a pretty reliable way to deadlock the system, because you're in the block-io path ;) 2008-02-21 10:16 link to daniel's bio throttling patch is broken 2008-02-21 10:16 tim_vimm: ah probably because of the kernel version change 2008-02-21 10:16 i'll fix it 2008-02-21 10:20 tim_vimm: thanks, it was all the patches... fixed now 2008-02-21 10:20 tx 2008-02-21 10:24 shapor: to kill the box, start writing a bunch of data to the volume, and take a snapshot 2008-02-21 10:25 did you get the traceback ? 2008-02-21 10:27 Nothing is terming, its just hanging 2008-02-21 10:29 dd is wedged writing data to the volume 2008-02-21 10:33 sysrq-t is kind of useless, its getting truncated 2008-02-21 10:36 (There are about 150 kernel processes) 2008-02-21 10:36 whys it getting truncated? 2008-02-21 10:37 its very hard to diagnose without any data ;) 2008-02-21 10:41 er, just took a while to fully populate kern.log 2008-02-21 10:42 dd was stuck in congestion_wait 2008-02-21 10:45 email full trace please :) 2008-02-21 10:55 willn: so it stopped making progress? 2008-02-21 10:55 Yea 2008-02-21 10:55 The machine is broken at the moment, lvdisplay and friends hang too.
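Editor's sketch, not part of the log: the channel relied on sysrq-t output to see where dd was stuck. As a lighter complement, and assuming a Linux /proc layout, tasks in uninterruptible sleep (state D) can be listed by reading /proc/<pid>/stat. The function names below are hypothetical, and this is not what the participants ran; it only reports the state letter, not the kernel stack that sysrq-t provides.

```python
import os


def parse_stat_line(line: str):
    """Split a /proc/<pid>/stat line into (pid, comm, state).

    comm is wrapped in parentheses and may itself contain spaces or
    parentheses, so the last ')' marks where it ends.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def blocked_tasks(proc_root: str = "/proc"):
    """Return (pid, comm) for every process whose state is 'D'."""
    if not os.path.isdir(proc_root):
        return []  # not a Linux-style /proc; nothing to report
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process exited or entry unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(pid, comm)
```

This reads only the main thread of each process, so a blocked worker thread inside a multi-threaded daemon would need /proc/<pid>/task/<tid>/stat as well.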
2008-02-21 10:57 i think this is the first time we're testing on top of md 2008-02-21 10:58 looks suspicious 2008-02-21 10:58 ddsnap server call trace: 2008-02-21 10:58 [ 1089.694714] [__wake_up+67/112] __wake_up+0x43/0x70 2008-02-21 10:58 [ 1089.694721] [_end+127685439/2129822308] :raid456:unplug_slaves+0x6b/0xc0 2008-02-21 10:58 [ 1089.694729] [_end+127313479/2129822308] :dm_mod:dm_table_unplug_all+0x33/0x50 2008-02-21 10:58 [ 1089.694734] [io_schedule+40/64] io_schedule+0x28/0x40 2008-02-21 10:58 [ 1089.694739] [__blockdev_direct_IO+2673/3040] __blockdev_direct_IO+0xa71/0xbe0 2008-02-21 11:10 Well, that is not good 2008-02-21 13:34 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-21 14:24 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-21 14:38 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-21 17:26 -!- erwan__taf(~erwan@visage.seanodes.com) has joined #zumastor 2008-02-21 17:34 -!- erwan__taf(~erwan@LAubervilliers-151-13-63-69.w217-128.abo.wanadoo.fr) has joined #zumastor 2008-02-21 17:36 -!- mitchg_(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-21 18:20 -!- natalie(~nataliep@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-21 23:44 -!- pgquiles(~pgquiles@62.43.226.52) has joined #zumastor irc.oftc.net #zumastor log beginning Fri Feb 22 00:00:01 PST 2008 2008-02-22 03:48 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-22 07:26 -!- charlesnw(~charles@ses.siderean.com) has joined #zumastor 2008-02-22 07:28 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-22 11:05 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-22 11:09 shapor: That patch does not seem to apply (changed old to old/block, new to new/block applying to 2.6.22.18) 2008-02-22 11:30 flipz: ^ 2008-02-22 11:30 willn, I'll be right over 2008-02-22 12:56 Gah. 
2008-02-22 12:57 Why, oh why do our build scripts have a hard-coded i386 arch? 2008-02-22 13:19 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-22 13:36 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-22 13:42 ok, there we go, dana is playing Shadow of the Colossus so I can get some work done 2008-02-22 13:43 ...because of dana not trying to type on my keyboard ;) 2008-02-22 13:43 willn, sowwy 2008-02-22 13:43 willn, because there exists naught besides? 2008-02-22 13:51 flips: It's silly. Since we are opposed to cross-compilation, we can just say 'build for what we are on' 2008-02-22 13:54 yes 2008-02-22 13:55 cross compilation is a slippery slope 2008-02-22 13:55 since we are not compiling zumastor for cellphones at the moment, we can build native 2008-02-22 13:55 well 2008-02-22 13:56 64 bit cross is often wanted 2008-02-22 13:56 but not so badly we need to put dev hours into it 2008-02-22 14:00 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-22 14:01 Well, I'd like to fix our (deb) package builder this weekend, because this is nonsense. 2008-02-22 14:02 We should be building sensible source packages, then building those on whatever arch can build source packages (including submitting to the PPA) 2008-02-22 14:49 sounds good 2008-02-22 14:50 jiayingz, ping? 2008-02-22 14:56 hi flips 2008-02-22 14:58 hi jiayingz 2008-02-22 14:59 just replying to the poster thread now 2008-02-22 14:59 no reason why we can't do this on the public list 2008-02-22 14:59 posting the poster to the public list? 2008-02-22 15:00 I think I might post the grammar tweaks to the public list, just the text 2008-02-22 15:00 no need to spam our own post with big xml files 2008-02-22 15:00 our own list I meant 2008-02-22 15:01 jiayingz, do you see your query chat window?
2008-02-22 15:01 yes 2008-02-22 15:03 flips, do you think there will be some problem when running zumastor on top of software raid? 2008-02-22 15:03 there is certainly a bug with the current patch set 2008-02-22 15:03 and a fix 2008-02-22 15:03 the fix is slowly working its way into our patches 2008-02-22 15:07 you think that is the bug in bio patch that causes hanging? 2008-02-22 15:07 yes 2008-02-22 15:42 ooffice has no obvious way to extract just the text from a slide. 2008-02-22 15:42 blows. seriously 2008-02-22 15:42 need help? 2008-02-22 15:42 unzipped odp is all one big amorphous line 2008-02-22 15:43 yes :) 2008-02-22 15:43 suggestion? 2008-02-22 15:43 hit me with what you have 2008-02-22 15:43 how about I just post the edited poster to the list 2008-02-22 15:43 2 secs 2008-02-22 15:45 ok, sent 2008-02-22 15:46 Sun should not be allowed to design anything 2008-02-22 15:47 save them from themselves 2008-02-22 15:47 hmm 2008-02-22 15:47 it unwrapped to another archive format 2008-02-22 15:47 it unzipped a zip 2008-02-22 15:47 tim, interested in pizza at goog? 2008-02-22 15:47 zippy one might say 2008-02-22 15:48 goog za! 2008-02-22 15:48 what time? 2008-02-22 15:48 just grab ooffice and edit the odp I think 2008-02-22 15:48 we have to produce ppt 2008-02-22 15:48 let me try that now 2008-02-22 15:48 do you have text? 2008-02-22 15:48 I should come by.. 2008-02-22 15:49 spousal unit is out of town, so I'm open 2008-02-22 15:49 you there now, or do you need scramjet shoes? 2008-02-22 15:49 saved as ppt, ooffice did not even whine about it 2008-02-22 15:49 that's easy 2008-02-22 15:49 scramjet time 2008-02-22 15:50 right about now 2008-02-22 15:50 see you there?
2008-02-22 15:50 see you in 20 2008-02-22 15:50 actual 20 today 2008-02-22 15:50 make it 30 2008-02-22 15:50 as opposed to 40 yesterday 2008-02-22 15:50 ok 2008-02-22 15:50 ;-) 2008-02-22 15:50 me I meant 2008-02-22 15:50 it couldn't be me 2008-02-22 15:50 i'll be 5 minutes early to my funeral 2008-02-22 15:51 see the stuff I'm doing on lkml with dhowells re cachefs? 2008-02-22 15:51 or rather the stuff dhowells is doing with me being the peanut gallery 2008-02-22 15:51 ah, drama? I wasn't aware 2008-02-22 15:52 much less than drama 2008-02-22 15:52 hopefully something good comes out 2008-02-22 15:53 i've been reading the thread pretty closely 2008-02-22 15:53 he hasn't mentioned solid state yet 2008-02-22 16:13 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-22 17:23 -!- jiayingz(~jiayingz@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-22 19:26 Posted latest test run 2008-02-22 19:27 summary: Something is causing us to flush to disk stupidly slow (<= 100KB/s) irc.oftc.net #zumastor log beginning Sat Feb 23 00:00:01 PST 2008 2008-02-23 01:58 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-23 07:20 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor 2008-02-23 07:21 hi 2008-02-23 07:21 I'm finally trying out the howto's replication example under vmware 2008-02-23 07:21 ten minutes and counting. It's way slow. I guess I should start with a smaller example...? 2008-02-23 07:28 sure you didn't fail already? 2008-02-23 07:29 oh, sorry somehow I was thinking xen. No fails with zumastor. 2008-02-23 07:45 dank: what size are your volumes and how much data have they got? 2008-02-23 07:46 the howto example (5G, only a small text file) replicates almost instantly for me 2008-02-23 07:46 (not tried in vmware, though) 2008-02-23 07:47 5g, only small text file 2008-02-23 07:47 what replication period?
2008-02-23 07:47 It took ten minutes of CPU time on one side, probably 5 on the other. 2008-02-23 07:47 60 seconds, just like in example. 2008-02-23 07:47 something's wrong, then 2008-02-23 07:48 did you zero the volume? 2008-02-23 07:48 Although the example seemed to give a period of 60 on one side and 600 on the other... I used 60 on both 2008-02-23 07:48 Er, um, I dunno. Not on purpose. But it was zero-ish when I originally started, since that's how I get the vmware image small. 2008-02-23 07:48 mmm that could be the reason 2008-02-23 07:49 was the disk already allocated or was vmware allocating it on demand? 2008-02-23 07:59 Already allocated. 2008-02-23 08:29 phun phact: if your current directory is the mount point, you won't see the new files. 2008-02-23 08:29 You have to cd out and back in. 2008-02-23 10:16 dank: the different periods are for pull and push 2008-02-23 10:16 the downstream one is the "nag interval" 2008-02-23 10:17 how often it "nags" for a push 2008-02-23 10:17 if upstream has stopped replicating for some reason 2008-02-23 10:17 i never really liked that 2008-02-23 10:37 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-23 11:30 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-23 13:55 Seems redundant...? 2008-02-23 15:14 I tried specifying --period downstream only, and it never replicated. 2008-02-23 15:16 'course it would help if I did define source 2008-02-23 15:16 actually, I did. Hmm. 2008-02-23 15:21 I see from downstream's nag.log that it's sending lots of wakeups. 2008-02-23 15:37 hmm, and on upstream, in master.log, I see "error writing to .../trigger". Followed by device-mapper: remove ioctl failed: Device or resource busy 2008-02-23 15:56 Bleah. I'll send scripts to reproduce this to the list, I guess...? 2008-02-23 16:42 start source helps, too 2008-02-23 17:31 ok, updated HOWTO with the script (at the bottom of the "Try Remote Replication" section.) Works.
Still waiting for Shapor's fix to bug 26, though. 2008-02-23 18:54 dank: what zumastor version are you using? the ioctl problem should be gone with jiayingz' patch a few days ago 2008-02-23 20:53 Ah, thanks. I'm running your build from Feb 9th. I guess I should try the zumastor.org builds, I haven't tried them in vmware lately. 2008-02-23 21:07 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor irc.oftc.net #zumastor log beginning Sun Feb 24 00:00:01 PST 2008 2008-02-24 01:13 -!- nataliep(~nataliep@cpe-76-94-49-21.socal.res.rr.com) has joined #zumastor 2008-02-24 04:26 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-24 08:27 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 08:52 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 09:12 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 09:16 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 11:50 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has left #zumastor 2008-02-24 12:02 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 16:37 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has left #zumastor 2008-02-24 16:38 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 16:40 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 19:36 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 20:44 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-24 22:32 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor irc.oftc.net #zumastor log beginning Mon Feb 25 00:00:01 PST 2008 2008-02-25 00:49 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-25 06:56 -!- 
charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 07:02 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 07:13 -!- charlesn1(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 07:33 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 07:34 -!- RomantiCo(~Lacoste@adsl-134-24-192-81.adsl.iam.net.ma) has joined #zumastor 2008-02-25 07:35 hello 2008-02-25 07:40 hello, how are you 2008-02-25 07:40 everyone 2008-02-25 08:39 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 08:50 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 09:21 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 10:23 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-25 11:02 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-25 11:13 -!- jiayingz(~jiayingz@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 13:04 flips: ping 2008-02-25 13:05 hi willn 2008-02-25 13:05 have you had a chance to poke at that sysrq output 2008-02-25 13:05 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-25 13:05 just now 2008-02-25 13:05 I will ping you in 15 or so 2008-02-25 13:06 ok 2008-02-25 13:23 willn, it is block io deadlock 2008-02-25 13:23 now why 2008-02-25 13:23 yes, please :) 2008-02-25 13:24 hmm, tasks blocking in mempool_free? that should not be 2008-02-25 13:24 xfs stuff 2008-02-25 13:24 xfs has found a way past our antideadlock measures methinks 2008-02-25 13:25 hmm 2008-02-25 13:25 I wonder if xfs effectively implements its own pflushd 2008-02-25 13:26 that would explain it 2008-02-25 13:26 willn, is this kernel compiled with frame pointers enabled?
2008-02-25 13:29 ddwrk blocked trying to get memory on a socket write, that task is in pf_memalloc mode, I do not see how that could be 2008-02-25 13:30 the box should not have been able to write to the log in that state 2008-02-25 13:34 the kernel is compiled with http://zumastor.googlecode.com/svn/trunk/kernel/config/2.6.22.18-i386-full 2008-02-25 13:34 rather, configured with that 2008-02-25 13:38 normally includes frame pointers 2008-02-25 13:38 xfs is doing some kinky stuff in sys_close 2008-02-25 13:39 CONFIG_FRAME_POINTER=y 2008-02-25 13:39 yea. 2008-02-25 13:39 root@zumathumper1:~# cat /proc/config.gz |gzip -d|grep POINTER 2008-02-25 13:39 CONFIG_FRAME_POINTER=y 2008-02-25 13:41 only two processes in d state, dd -> sys_close and ddsnap_server -> direct write 2008-02-25 13:41 this is the big clue 2008-02-25 13:41 willn, ok looks like a made-in-xfs issue 2008-02-25 13:41 any reason why we need xfs immediately? 2008-02-25 13:42 Well, ext3 blows for big volumes 2008-02-25 13:42 can you quantify that? 2008-02-25 13:45 mkfs.ext3: Filesystem too large. No more than 2**31-1 blocks (8TB using a blocksize of 4k) are currently supported. 
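Editor's sketch, not part of the log: the 8TB figure in the mkfs.ext3 message is just the block-count limit multiplied by the block size. The helper name below is hypothetical; the 2**31-1 limit and the 4k block size come from the message itself, and the 2**32 case is included only because it is raised next in the channel.

```python
TIB = 2**40  # bytes in one tebibyte


def max_fs_bytes(max_blocks: int, block_size: int) -> int:
    """Largest filesystem size in bytes for a given block-count limit."""
    return max_blocks * block_size


# Limit quoted by mkfs.ext3 in the log: 2**31-1 blocks of 4k each.
limit_31 = max_fs_bytes(2**31 - 1, 4096)

# Hypothetical limit if the block count went up to 2**32-1.
limit_32 = max_fs_bytes(2**32 - 1, 4096)

print(f"2**31-1 blocks: {limit_31 / TIB:.2f} TiB")
print(f"2**32-1 blocks: {limit_32 / TIB:.2f} TiB")
```

The first value is one 4k block short of 8 TiB, which is why the 18T volume from earlier in the log cannot be formatted as ext3 with this mkfs.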
2008-02-25 13:49 I thought ext3 did 2**32 now 2008-02-25 13:49 anyway, yes that is a hard limit 2008-02-25 13:50 we can accept an 8TB limit for the immediate future I think 2008-02-25 13:54 ext3 isn't so hot at handling large single files either 2008-02-25 14:04 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-25 14:07 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-25 14:07 -!- phillips_(~phillips@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 14:07 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-25 14:07 -!- juuva_(juuva@peili.org) has joined #zumastor 2008-02-25 14:07 -!- flips(~phillips@phunq.net) has joined #zumastor 2008-02-25 14:07 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-25 14:07 -!- willn(~wan@pinball.ccs.neu.edu) has joined #zumastor 2008-02-25 14:07 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-25 14:09 willn, and ext3 deletes very slowly 2008-02-25 14:09 anyway, we now think the issue is md, not xfs 2008-02-25 14:21 Yep. 2008-02-25 14:23 flips, looks like the bio patch of 2.6.22.18 does not include the latest fix 2008-02-25 14:23 I am going to commit a patch to fix it 2008-02-25 14:23 that is true 2008-02-25 14:23 thank you 2008-02-25 14:24 np. I was hit by that bug when I tried to run mkfs.ext3 on an md device 2008-02-25 14:24 ACTION thinks we need an md cbtb test 2008-02-25 14:25 it's actually virtual device stacking in general that broke 2008-02-25 14:26 once fixed, it is highly unlikely to break 2008-02-25 14:26 if we had a special test for every feature in linux...
2008-02-25 14:26 now we have an md performance issue it seems 2008-02-25 15:03 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 15:33 -!- mitchg(~mitchg@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 16:37 -!- dkegel(~chatzilla@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-25 20:59 -!- CooB(~pupsigii@87.250.219.122) has joined #zumastor 2008-02-25 20:59 -!- CooB(~pupsigii@87.250.219.122) has left #zumastor 2008-02-25 22:42 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor irc.oftc.net #zumastor log beginning Tue Feb 26 00:00:01 PST 2008 2008-02-26 07:42 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-26 08:32 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-26 08:49 -!- dank(~chatzilla@cpe-76-90-56-73.socal.res.rr.com) has joined #zumastor 2008-02-26 08:49 cbsmith: Dan, Shapor, and Jiaying are all off at FAST. Want to get together at my place or Stir Crazy to talk/hack? 2008-02-26 08:57 dank: sounds great. I have to leave a bit late (Q is sick and mom is dropping off lesson plans at work before heading in), but Stir Crazy might be a nice change of scene. 
2008-02-26 11:43 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-26 12:54 -!- cbsmith(~xman@adsl-63-193-154-252.dsl.lsan03.pacbell.net) has joined #zumastor 2008-02-26 13:54 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-26 17:29 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-26 18:01 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-26 21:46 -!- flipz(~phlipz@adsl-63-202-13-187.dsl.snfc21.pacbell.net) has joined #zumastor 2008-02-26 21:47 hey shapor 2008-02-26 22:39 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-26 22:54 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-26 23:03 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-26 23:29 flipz: pong 2008-02-26 23:29 hi shapor 2008-02-26 23:30 just about to go down for the night 2008-02-26 23:30 in the morning I need to pick up a usb key for my slides, forgot mine at home 2008-02-26 23:30 and see if kinko's will print the timothy version of jiaying's poster for not too much $$$ 2008-02-26 23:44 it won't be bad 2008-02-26 23:44 if you're printing the small one 2008-02-26 23:58 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor irc.oftc.net #zumastor log beginning Wed Feb 27 00:00:01 PST 2008 2008-02-27 02:08 -!- erwan_taf(~erwan@234-132.206-83.static-ip.oleane.fr) has joined #zumastor 2008-02-27 03:47 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-27 05:17 -!- pgquiles_(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-27 05:57 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-27 07:40 -!- erwan_taf(~erwan@234-132.206-83.static-ip.oleane.fr) has joined #zumastor 2008-02-27 07:46 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has 
joined #zumastor 2008-02-27 09:11 good morning 2008-02-27 09:15 hi flipz 2008-02-27 09:16 ooh, EU fines microsoft $1.3 billion 2008-02-27 09:19 microsofties on slashdot are claiming EU will give the money to airbus to compete with boeing 2008-02-27 09:19 heh 2008-02-27 09:20 howdy 2008-02-27 09:22 morning tim 2008-02-27 09:22 you guys off to the conference? 2008-02-27 09:24 I'm off to get a usb key 2008-02-27 09:25 oh, the challenges we face... 2008-02-27 09:25 getting a usb key in silicon valley during rush hour... 2008-02-27 09:26 did my slides last night 2008-02-27 09:26 would love to see them when you're back 2008-02-27 09:26 nothing close to what you would do 2008-02-27 09:27 thanks for the compliment 2008-02-27 09:27 i've seen usb keys in hotel vending machines in silly valley.. great idea 2008-02-27 09:28 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-27 09:28 tim_vimm: anyone from violin going to fast? 2008-02-27 09:28 you'd think they'd have them in bins at the googleplex but no 2008-02-27 09:29 donpaul will be in town tomorrow, but I'm not sure if he's going 2008-02-27 09:29 I guess the problem is, the bins are all full of sushi 2008-02-27 09:30 we can meet up at the googleplex tomorrow or saturday 2008-02-27 09:53 ACTION blinks 2008-02-27 14:49 -!- cbsmith(~xman@207.47.98.129.static.nextweb.net) has joined #zumastor 2008-02-27 15:38 Hey FAST people, when is the poster session?
2008-02-27 15:42 willn: 5pm 2008-02-27 15:42 http://www.usenix.org/events/fast08/poster.html 2008-02-27 15:52 -!- sagemob(~sage@12.155.21.104) has joined #zumastor 2008-02-27 21:53 -!- sagemob(user@7.sub-75-209-134.myvzw.com) has joined #zumastor 2008-02-27 23:56 -!- flipz(~phlipz@adsl-63-202-13-187.dsl.snfc21.pacbell.net) has joined #zumastor irc.oftc.net #zumastor log beginning Thu Feb 28 00:00:01 PST 2008 2008-02-28 00:09 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-28 00:55 -!- erwan_taf(~erwan@81.80.43.67) has joined #zumastor 2008-02-28 06:42 ACTION blinks in BOS 2008-02-28 07:13 ACTION wonders what BOS means 2008-02-28 07:22 Boston, MA, USA 2008-02-28 07:23 pgquiles: re: 2008-02-28 07:23 oh :-) 2008-02-28 07:24 issue 85, i'm betting we probably also want results with non-ppa packages (Since the PPA packages are not qualified by our build system) 2008-02-28 07:24 what do you mean "qualified"? 2008-02-28 07:24 compiled/packaged/tested 2008-02-28 07:25 but, our qualified package postings are quite out of date 2008-02-28 07:43 oh, you mean the official packages 2008-02-28 07:43 willn: but the latest official packages were for 0.6, IIRC 2008-02-28 07:44 indeed, for 0.6r1318 2008-02-28 07:44 after shapor's patch for issue 71, the results of that r1318 binary are probably useless 2008-02-28 07:48 pgquiles: Snapshots 2008-02-28 07:51 willn: http://zumastor.org/downloads/snapshots/ ? 2008-02-28 07:51 that's even older 2008-02-28 08:02 Yea. We have newer stuff that has not been published 2008-02-28 08:10 oh 2008-02-28 08:36 willn: what do you think of moving my kernel-zumastor and kernel-xenzumastor packages to the zumastor-team PPA? 
2008-02-28 08:53 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-28 08:58 pgquiles: I've been in the process of doing mostly that 2008-02-28 08:58 We basically want a couple upstream kernel packages with just the zumastor patches added on 2008-02-28 09:09 willn: you need to have all packages 2008-02-28 09:10 a -server package won't work on a -zumastor kernel 2008-02-28 09:10 linux-image, linux-backports-modules, linux-restricted-modules, linux-ubuntu-modules and linux-meta are needed 2008-02-28 09:10 xen-meta, too, for -xenzumastor 2008-02-28 09:24 -!- sagemob_(~sage@12.155.21.104) has joined #zumastor 2008-02-28 09:34 -!- sagemob_(~sage@12.155.21.104) has left #zumastor 2008-02-28 09:39 willn: what we can do is avoid building -server, -lpia, -generic, etc in zumastor's PPA but if we want -zumastor to become part of ubuntu 2008-02-28 09:39 we must use the same build chain 2008-02-28 10:26 ppa? 2008-02-28 10:40 flipz: Personal Package Archive 2008-02-28 10:41 flipz: https://help.launchpad.net/PPAQuickStart 2008-02-28 10:46 thanks 2008-02-28 10:55 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-28 11:50 -!- flipz(~phlipz@12.155.21.100) has joined #zumastor 2008-02-28 12:35 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has left #zumastor 2008-02-28 12:48 -!- cbsmith(~xman@adsl-99-165-23-73.dsl.lsan03.sbcglobal.net) has joined #zumastor 2008-02-28 13:41 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-28 13:53 -!- flipz(~phlipz@12.155.21.103) has joined #zumastor 2008-02-28 18:59 -!- charlesnw(~charles@63.139.86.9) has joined #zumastor 2008-02-28 22:45 -!- pgquiles(~pgquiles@81.202.65.108.dyn.user.ono.com) has joined #zumastor 2008-02-28 22:46 -!- flipz(~phlipz@adsl-63-202-13-187.dsl.snfc21.pacbell.net) has joined #zumastor 2008-02-28 22:46 getting frisbees from the microsoft recruiters was fun 2008-02-28 22:46 :) 2008-02-28 23:42 
http://people.valinux.co.jp/~ryov/dm-ioband/ irc.oftc.net #zumastor log beginning Fri Feb 29 00:00:01 PST 2008 2008-02-29 00:39 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-29 00:50 -!- pgquiles(~pgquiles@62.43.226.52.static.user.ono.com) has joined #zumastor 2008-02-29 05:23 -!- emmy29(~emmy29@ANantes-257-1-78-233.w90-25.abo.wanadoo.fr) has joined #zumastor 2008-02-29 09:16 -!- charlesnw(~charles@cpe-75-84-92-80.socal.res.rr.com) has joined #zumastor 2008-02-29 09:51 -!- zumalog(~zumalog@yzf.shapor.com) has joined #zumastor 2008-02-29 09:52 -!- ChanServ changed mode/#zumastor -> -o flips 2008-02-29 09:53 -!- ChanServ changed topic to "http://www.zumastor.org" 2008-02-29 10:04 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-29 11:37 -!- pgquiles(~pgquiles@90.Red-83-34-134.dynamicIP.rima-tde.net) has joined #zumastor 2008-02-29 12:42 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-29 12:51 -!- shapor(~shapor@yzf.shapor.com) has joined #zumastor 2008-02-29 14:11 -!- tim_vimm(~Tim@cpe-76-90-128-140.socal.res.rr.com) has joined #zumastor 2008-02-29 16:22 -!- emmy29(~emmy29@ANantes-257-1-78-233.w90-25.abo.wanadoo.fr) has joined #zumastor 2008-02-29 23:07 ...back