Timestamps are in GMT/BST.
| 00:30:59 | raptor | ok back - sorry, i took my kid around the block on his two-wheeler |
| 00:31:08 | raptor | too much network with Circle? |
| 00:31:45 | raptor | so for Core-mines, would they be instant kill? have a time-limit? Be destroyable? |
| 00:32:30 | | Watusimoto Quit (Ping timeout: 252 seconds) |
| 00:42:02 | | raptor Quit () |
| 00:45:41 | | koda Quit (Quit: I used to be chatting like you. Then I took an arrow in the knee) |
| 02:52:54 | | Heyub Quit (Quit: KVIrc 4.0.4 Insomnia http://www.kvirc.net/) |
| 10:01:24 | | sam686 Quit (Ping timeout: 245 seconds) |
| 10:04:32 | | koda has joined |
| 12:03:49 | | koda Quit (Quit: koda) |
| 12:04:26 | | koda has joined |
| 12:11:30 | | LordDVG has joined |
| 13:18:29 | | Watusimoto has joined |
| 13:56:27 | | raptor has joined |
| 13:56:28 | | ChanServ sets mode +o raptor |
| 16:59:10 | | sam686 has joined |
| 16:59:11 | | ChanServ sets mode +v sam686 |
| 17:27:08 | | Watusimoto Quit (Ping timeout: 240 seconds) |
| 18:01:02 | | Watusimoto has joined |
| 18:56:43 | | Watusimoto Quit (Ping timeout: 246 seconds) |
| 18:56:59 | | Watusimoto has joined |
| 19:01:27 | Watusimoto | hi |
| 19:01:33 | Watusimoto | quick syntax question |
| 19:01:48 | Watusimoto | in current scripts, we can load a file with require: |
| 19:01:58 | Watusimoto | require("geometry") to load our geometry library |
| 19:02:06 | Watusimoto | that no longer will work |
| 19:02:09 | Watusimoto | so I |
| 19:02:15 | Watusimoto | 'm creating a new function |
| 19:02:37 | Watusimoto | the function can either load the file but not execute it, so scripts would write this: |
| 19:02:39 | Watusimoto | dofile(luaUtil:findFile("list")) |
| 19:02:50 | Watusimoto | or it can load and execute it, so we'd do this |
| 19:03:02 | Watusimoto | luaUtil:runFile("list") |
| 19:03:35 | Watusimoto | the only advantage to loading only is if... |
| 19:03:45 | Watusimoto | oh heck, I'm not sure why you'd want to load only |
| 19:04:20 | Watusimoto | forget it |
| 19:30:02 | raptor | hi |
| 19:30:10 | raptor | uhh... i'll adapt? |
| 19:48:40 | Watusimoto | I can now run at least one bot |
| 19:48:50 | Watusimoto | which is a huge step forward... |
| 19:48:53 | Watusimoto | now to try two |
| 19:48:56 | raptor | great! |
| 19:49:49 | Watusimoto | when i can run multiple bots, I'll check in so you can measure the memory usage of each, and see if all this work has improved anything |
| 19:50:45 | Watusimoto | ok, two bots works! |
| 19:50:48 | Watusimoto | in the same L! |
| 19:52:01 | Watusimoto | 5 bots work |
| 19:52:09 | Watusimoto | ok, checking in |
| 19:53:52 | raptor | did you check memory? |
| 19:53:54 | raptor | add 100 |
| 19:55:28 | raptor | or 50 |
| 20:00:36 | Watusimoto | how would I check it? via process monitor? |
| 20:00:45 | raptor | yep |
| 20:01:02 | raptor | if it goes from 20MB -> 120 MB (with 100 bots) |
| 20:01:14 | raptor | or if it stays significantly less... |
| 20:01:20 | raptor | let me test on my machine the RAM... |
| 20:01:30 | Watusimoto | ah... freezes up a bit |
| 20:02:19 | raptor | ok, no bots: 16/22 (dedicated/shared memory) |
| 20:02:44 | Watusimoto | performance is NOT good |
| 20:02:58 | raptor | with 100 bots: 61/22 |
| 20:03:14 | sam686 | it will help to run with non-debug compile with optimizations... |
| 20:03:15 | Watusimoto | trying 017a with 100 bots |
| 20:04:27 | raptor | with 50 bots: 41/22 |
| 20:04:37 | raptor | maybe i should check virtual memory? |
| 20:04:52 | raptor | ah, haha |
| 20:04:53 | sam686 | while combining robots to one LUA might save memory, it doesn't really save any CPU usage - it still has to do a lot of calculations |
| 20:05:01 | Watusimoto | 51748K "private bytes" to 97,000K with 100 bots |
| 20:05:17 | Watusimoto | bots all died |
| 20:05:28 | Watusimoto | back to 55000K |
| 20:05:42 | Watusimoto | so that's the old way. I'll do a release build with the new way |
| 20:05:58 | raptor | good idea |
| 20:06:59 | Watusimoto | This new way cannot improve performance, only memory consumption. The best we can hope for is performance-neutral performance |
| 20:07:10 | raptor | yes |
| 20:07:24 | raptor | but that is still good, esp. for dedicated servers on limited RAM hosts |
| 20:07:26 | Watusimoto | I can do some real optimizations during instantiation, though |
| 20:07:44 | Watusimoto | well, I'm more concerned with CPU than ram |
| 20:07:56 | Watusimoto | but i can compile all the standard bot support code once and clone it |
| 20:08:01 | Watusimoto | with the new way |
| 20:08:07 | sam686 | the smaller memory usage does help to prevent out of memory problems, mostly in some of limited RAM vps servers.. |
| 20:08:08 | Watusimoto | currently, we compile it for each bot |
| 20:10:43 | raptor | I remember adding zone-to-zone path caching to save CPU for each bot with 015a or 016... |
| 20:11:02 | raptor | we could probably move that out to a more global scope to save RAM further |
| 20:11:16 | raptor | maybe i already made it static... |
| 20:12:05 | Watusimoto | crashed |
| 20:13:07 | | LordDVG Quit (Remote host closed the connection) |
| 20:13:54 | Watusimoto | ok |
| 20:14:01 | Watusimoto | from 49872K with no bots to... |
| 20:14:26 | raptor | the suspense! |
| 20:14:59 | Watusimoto | 70-75000K with 100 bots |
| 20:15:08 | Watusimoto | definite improvement there |
| 20:15:09 | raptor | great! |
| 20:15:14 | raptor | that's great! |
| 20:15:19 | Watusimoto | but the performance is total crap |
| 20:15:27 | raptor | worse than before? |
| 20:15:37 | Watusimoto | much |
| 20:15:40 | raptor | 100 bots is choppy even on my CPU |
| 20:15:58 | Watusimoto | ok, 73472K before (memory doesn't seem to be released, which is easily remedied) |
| 20:17:21 | Watusimoto | to 85000K with 100 bots? |
| 20:18:26 | Watusimoto | memory jumps around a lot |
| 20:18:38 | Watusimoto | which suggests lots of stuff being created, deleted, and garbage collected |
| 20:22:46 | Watusimoto | memory usage with old way is rock steady |
| 20:23:10 | Watusimoto | once bot is created, I guess it just sits there in memory |
| 20:24:18 | Watusimoto | 57->96->57,000K with 017a |
| 20:24:23 | Watusimoto | and 100 s_bots |
| 20:24:46 | Watusimoto | so new way 25000K for 100 bots |
| 20:24:49 | Watusimoto | old way |
| 20:25:00 | Watusimoto | 40000K for 100 bots |
| 20:25:22 | raptor | hmm... |
| 20:25:27 | Watusimoto | I would expect the memory use would be lower for new way |
| 20:25:40 | raptor | it is a significant change, but i guess i was thinking an order of magnitude... |
| 20:25:53 | Watusimoto | and I would not expect to see memory cycling like that |
| 20:26:00 | Watusimoto | need to investigate a little further |
| 20:26:16 | Watusimoto | but performance of new way is, for the moment, unacceptable. |
| 20:34:44 | raptor | i wonder if we should profile bot performance specifically |
| 20:45:08 | Watusimoto | probably we should at some point |
| 20:45:30 | Watusimoto | we have some support for a lua profiler built in, though I've never used it |
| 20:48:41 | Watusimoto | I think I'm going to hold off on checking in until I feel better about where this is going |
| 20:55:53 | raptor | you mean if it is even worth it? |
| 20:56:44 | raptor | I thought for sure that memory would be reduced to a tenth at least... but maybe my premise was wrong in the first place |
| 20:59:52 | raptor | FYI: i submitted the bitfighter 017a update to linuxgames.com and happypenguin.org... it should show up soon sometime |
| 21:04:48 | Watusimoto | excellent. I sent an email a few days ago |
| 21:04:59 | raptor | to those same sites? |
| 21:05:05 | raptor | oh the list |
| 21:05:07 | raptor | yes |
| 21:05:08 | raptor | ok |
| 21:31:24 | Watusimoto | ok, on madcow, adding 100 bots and watching the FPS gives comparable results from 017a and new lua method |
| 21:31:31 | Watusimoto | we need a better way of measuring! |
| 21:33:06 | Watusimoto | there really should be no performance penalty to the new way... I think it only does a pointer copy or two of extra work |
| 21:42:02 | raptor | so you fixed something? |
| 21:51:31 | Watusimoto | no |
| 21:51:35 | sam686 | dumb question, are you making sure the "onTick" is called only once per robot? Make sure not to call onTick (100 robots) 10000 times.. |
| 21:51:38 | Watusimoto | just hard to measure performance |
| 21:52:00 | raptor | ok, so wait... |
| 21:52:03 | Watusimoto | why would I call it 10k times? |
| 21:52:09 | Watusimoto | does that happen now? |
| 21:52:17 | raptor | so now you say performance is similar or still worse? |
| 21:52:31 | Watusimoto | I can' |
| 21:52:36 | Watusimoto | I can't really tell |
| 21:52:39 | sam686 | sending a onTick event 100 times, each event calls all 100 robot's onTick, thats a total of 10000 |
| 21:52:49 | Watusimoto | that happens in 017a? |
| 21:53:18 | sam686 | it happened earlier, maybe like between 016 and 015, but was fixed back then.. |
| 21:53:45 | Watusimoto | ok, then it should still be fixed |
| 21:53:50 | Watusimoto | that would be a disaster |
| 21:53:56 | Watusimoto | for performance |
| 21:53:59 | raptor | i remember that one |
| 21:54:10 | raptor | yeah, sam686's just double checking it's not happening again |
| 21:54:38 | Watusimoto | I hope not |
| 21:54:49 | sam686 | if the RAM usage of bitfighter is jumping up and down a bunch of times, that can eat CPU usage trying to allocate and free memory.. |
| 21:55:26 | Watusimoto | yes |
| 21:55:35 | Watusimoto | I can't figure out why it does that, though |
| 21:56:00 | Watusimoto | should be smooth as glass |
| 21:56:22 | Watusimoto | I'm not creating/deleting objects to manage the bots |
| 21:57:38 | raptor | seaglass? |
| 21:59:40 | Watusimoto | On a 3 GHz P4 a lua_call() to an empty Lua function takes around |
| 21:59:40 | Watusimoto | 80ns. A lua_pcall() (which is what you probably want) or a |
| 21:59:40 | Watusimoto | lua_resume() takes around 130ns. Times are only slightly lower |
| 21:59:40 | Watusimoto | with LuaJIT because the overhead of calling _into_ the VM is |
| 21:59:40 | Watusimoto | pretty much constant. |
| 21:59:41 | Watusimoto | So you can make around 8 million calls per second into Lua (minus |
| 21:59:43 | Watusimoto | the time your app and the Lua functions take). Still worried? |
| 22:01:32 | raptor | is that copy pasta? |
| 22:07:32 | Watusimoto | yes |
| 22:07:50 | Watusimoto | just showing that there should be no problem with 100 bots |
| 22:08:15 | raptor | so our performance is due to our algorithms |
| 22:08:26 | Watusimoto | It would seem so |
| 22:08:47 | Watusimoto | I'm testing some things just to make sure I'm not recompiling the script every cycle or something stupid like that |
| 22:10:58 | Watusimoto | it's statements like this that worry me (from s_bot:) |
| 22:11:00 | Watusimoto | local bullets = bot:findItems(BulletType,AsteroidType,MineType) |
| 22:11:16 | Watusimoto | that creates a ton of objects when there are 100 bots shooting at one another |
| 22:11:20 | raptor | oh yikes |
| 22:11:24 | Watusimoto | yet... the bot needs this info |
| 22:12:43 | sam686 | http://sam686.maxhushahn.com/upload/profile_4321_81b6c9c27092.PNG That is with 40 robots |
| 22:13:28 | Watusimoto | going to try removing that line and see what it does to my performance |
| 22:13:57 | raptor | sam686: is all that allocation/free because of bullets? |
| 22:14:06 | Watusimoto | alloc |
| 22:14:27 | Watusimoto | maybe we need an object pool or something |
| 22:14:43 | raptor | i would think that the gridDB calls should be comparatively the most expensive... |
| 22:15:11 | Watusimoto | yes -- perhaps those could be cached each game cycle |
| 22:15:54 | sam686 | they are mostly coming from robots... |
| 22:17:56 | raptor | ooo |
| 22:17:57 | sam686 | bullets and all gameItems do not realloc, but vectors might realloc to make more room |
| 22:18:51 | sam686 | but most of realloc's backtrace is mostly coming from "luaM_realloc_" |
| 22:20:00 | raptor | cool, gcc allows you to compile in profiling support.. |
| 22:20:37 | Watusimoto | bah, removing the line I pasted earlier makes no dramatic difference |
| 22:25:10 | raptor | brb |
| 22:25:12 | | raptor Quit () |
| 22:26:11 | | raptor has joined |
| 22:26:12 | | ChanServ sets mode +o raptor |
| 22:26:55 | raptor | have good data: |
| 22:27:14 | raptor | http://sam686.maxhushahn.com/upload/profile_flat.txt |
| 22:27:41 | raptor | but it's a lot... |
| 22:27:50 | raptor | http://sam686.maxhushahn.com/upload/profile_callgraph.txt |
| 22:28:00 | raptor | the first is 1.2MB |
| 22:28:08 | raptor | the second 6.8 |
| 22:28:31 | raptor | that's running the new ctf level with 20 bots for about a minute |
| 22:36:39 | raptor | this is interesting, too: http://sam686.maxhushahn.com/upload/profile_annotated_robot_cpp.txt |
| 22:36:45 | sam686 | i would still wonder why realloc appears to be the most costly on windows, more than linux.... |
| 22:38:54 | raptor | i can get annotated, profiled code for any class you want to see... |
| 22:41:20 | raptor | maybe Lunar is the problem? |
| 22:42:55 | sam686 | C:\Program Files\Bitfighter\HG017\lua\lua-vec\src/lauxlib.c, l_alloc, lets see if i can change that function a bit, to use huge one megabyte array memory instead of lots of tiny memory, and writing my own memory handling... |
| 22:43:30 | raptor | well, before you do something drastic, maybe Watusimoto is already working on something? |
| 22:47:08 | Watusimoto | no not really |
| 22:47:11 | Watusimoto | just poking around |
| 22:47:18 | Watusimoto | I got rid of some dynamic casts |
| 22:48:15 | Watusimoto | but I'm not quite convinced that lua is really creating tons of objects |
| 22:48:33 | Watusimoto | it's passing pointers to tons of objects, for sure |
| 22:48:44 | Watusimoto | but that might not create an allocation event |
| 22:48:49 | Watusimoto | those might be simple values |
| 22:48:54 | Watusimoto | that don't get allocated |
| 22:49:15 | raptor | well, Lunar::thunk is sure being called a lot |
| 22:50:22 | Watusimoto | this one seems expensive: |
| 22:50:23 | Watusimoto | calcInterceptCourse |
| 22:50:30 | Watusimoto | and gets called a lot |
| 22:51:47 | raptor | http://www.rasterbar.com/products/luabind.html |
| 22:52:03 | raptor | seems to be the most oft cited way to bind lua to c++ |
| 22:52:08 | raptor | luabind |
| 22:52:10 | Watusimoto | I think thunk is very cheap |
| 22:52:16 | raptor | is there a reason you chose Lunar? |
| 22:52:17 | Watusimoto | it's essentially a cast |
| 22:52:21 | Watusimoto | it was easy |
| 22:53:47 | raptor | thunk is cheap, but is called millions of times |
| 22:55:13 | Watusimoto | any binding system will need something similar, I think |
| 22:56:43 | Watusimoto | though maybe I'm wrong |
| 22:56:47 | Watusimoto | I don't know! |
| 22:58:31 | Watusimoto | canSeePoint gets called a ton too |
| 23:00:31 | sam686 | thats a whole lot of l_alloc being called a lot, 76725 times in just 30 second of only 2 robots... |
| 23:00:56 | sam686 | i put a printf in that function and found out |
| 23:03:30 | Watusimoto | ok, so what is calling that? |
| 23:03:50 | Watusimoto | I think it would be better to eliminate calls rather than try to optimize them |
| 23:05:25 | raptor | agreed |
| 23:06:12 | sam686 | whats worse is a bunch of "l_alloc" is allocating a size of only 32 or 16 bytes.. |
| 23:08:01 | sam686 | lua/lua-vec/src/lauxlib.c has the l_alloc, just put a breakpoint there or printf or whatever, and see if you can fix too many l_alloc |
| 23:08:52 | Watusimoto | a gazillion tiny allocs can't be good |
| 23:09:15 | raptor | in LuaRobot::doFindItems there is a lua_createtable |
| 23:09:24 | raptor | that method is being called the most, i think.. |
| 23:10:24 | sam686 | one of which is in LuaRobot::doFindItems .. lua_createtable .. luaH_new allocating 32 bytes... for every doFindItems... |
| 23:10:52 | raptor | ouch |
| 23:11:11 | Watusimoto | we do that a ton |
| 23:12:35 | Watusimoto | ah, we create a table to hold the items we found |
| 23:12:55 | Watusimoto | and probably discard it a moment later |
| 23:14:25 | Watusimoto | we could probably just push the found items onto the stack without a table |
| 23:14:44 | Watusimoto | the table is probably a real object create operation for lua |
| 23:14:53 | raptor | robot.cpp:682 what is that clearStack(L) doing? |
| 23:15:07 | Watusimoto | the other stuff might not be... might just be pushing values around |
| 23:15:28 | Watusimoto | will need context -- my numbers aren't same as yours |
| 23:15:30 | raptor | which is being called roughly 7 times per idle tick |
| 23:15:34 | raptor | per bot |
| 23:16:02 | raptor | oh, it's in doFindItems right before lua_createtable |
| 23:16:10 | sam686 | really?? LuaShip::getCurrLoadout .. Lunar<Zap::LuaLoadout>::push .. <LuaRobot>pushuserdata.. lua_newuserdata .. realloc being called |
| 23:16:22 | sam686 | as in any push creates alloc it seems... |
| 23:16:47 | Watusimoto | clearStack is just getting rid of any junk on the stack |
| 23:17:03 | Watusimoto | we won't need it when we have better stack hygiene |
| 23:17:13 | raptor | hehe |
| 23:17:22 | Watusimoto | I've started adding asserts all over the place that ensure the stack is cleaned when we're done with it |
| 23:17:36 | Watusimoto | so that's |
| 23:17:37 | Watusimoto | "free" |
| 23:17:43 | Watusimoto | we'd have to do that one way or another |
| 23:18:36 | Watusimoto | it's the create_table I'm suspicious of |
| 23:19:25 | Watusimoto | I'll try to rewrite without the table |
| 23:19:29 | Watusimoto | see if that helps |
| 23:19:35 | Watusimoto | will be messier, but perhaps faster |
| 23:20:14 | Watusimoto | I got rid of a dynamic cast in this fn, btw |
| 23:20:20 | raptor | yay |
| 23:23:30 | sam686 | looks like my simplest mine_bot doesn't do any l_malloc, and eliza doesn't do any malloc except when outputting text |
| 23:36:11 | Watusimoto | I'm going to add 100 elizabots and see what happens |
| 23:39:37 | Watusimoto | close to 100 fps in debug mode |
| 23:40:13 | Watusimoto | let's try 500 |
| 23:40:30 | sam686 | i think the max robots limit is set to 256.. |
| 23:40:33 | Watusimoto | sending scoreboard crashed :-) |
| 23:52:21 | raptor | ha! i just hit the map q-party on your server sam686 |
| 23:52:31 | raptor | i forgot how nutso this map was |
| 23:58:24 | Watusimoto | btw, we're using method 2 |
| 23:58:26 | Watusimoto | http://stackoverflow.com/questions/8003941/how-to-share-reuse-a-lua-script-for-multiple-entities |
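
The onTick fan-out that sam686 asks about at 21:51 to 21:53 (one tick event raised per robot, with each event calling every robot's onTick) can be illustrated with a small counting sketch. This is not Bitfighter code: `Bot`, `tickAllBroadcast`, and `tickAllOnce` are hypothetical names invented for the illustration, and the sketch only counts calls rather than running any Lua.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a bot; it only counts how often it is ticked.
struct Bot {
    std::size_t tickCalls = 0;
    void onTick() { ++tickCalls; }
};

// Problem pattern: one tick event is raised per bot, and every event is
// broadcast to all bots, so a single frame costs N * N onTick calls.
std::size_t tickAllBroadcast(std::vector<Bot>& bots) {
    std::size_t calls = 0;
    for (std::size_t event = 0; event < bots.size(); ++event) {
        for (Bot& bot : bots) {
            bot.onTick();
            ++calls;
        }
    }
    return calls;
}

// Intended pattern: each bot is ticked exactly once per frame (N calls).
std::size_t tickAllOnce(std::vector<Bot>& bots) {
    std::size_t calls = 0;
    for (Bot& bot : bots) {
        bot.onTick();
        ++calls;
    }
    assert(calls == bots.size());
    return calls;
}
```

With 100 bots the broadcast version makes 10000 onTick calls per frame against 100 for the once-per-bot version, which is the difference sam686 was double checking for.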