



Optimization and micro-optimization


43 replies to this topic

#21 HDeffo

  • Members
  • 214 posts

Posted 07 September 2015 - 04:37 PM

awsumben13, on 07 September 2015 - 04:32 PM, said:

-snip-

You really seem to know your stuff in this case :D Your comments will be tested and added to the guide. Thanks for your input.

#22 クデル

  • Members
  • 349 posts

Posted 14 September 2015 - 07:24 AM

Thanks for sharing, will definitely be using some of these!

#23 blunty666

  • Members
  • 79 posts

Posted 14 September 2015 - 05:27 PM

awsumben13, on 07 September 2015 - 04:32 PM, said:

With concatenation, it completely depends on how you're using it. If you're concatenating 5 things, using '..' is definitely quicker.
local s = "a" .. "b" .. "c" .. "d" .. "e"
-- will execute faster than
local s = table.concat { "a", "b", "c", "d", "e" }

However, you're generally concatenating things in a loop, something like this:
local s = ""
for i = 1, n do
	s = s .. v
end
Don't do this! Lua (C-Lua at least) interns its strings: every newly created string is hashed and looked up in a global string table, and if an equal string already exists, that one is reused instead of the new one. It does this so that equal strings share a single copy and can be compared cheaply. On top of that, each pass through the loop builds a brand-new string and copies everything accumulated so far, so the total work grows roughly quadratically with n. Creating strings repeatedly (like in the above example) is therefore slow. In that case, using table.concat() is a lot better:
local t = {}
for i = 1, n do
	t[i] = v
end
local s = table.concat( t )
You'll notice ridiculous speed increases by doing this.

Just to clarify on this, if I know exactly how many strings I am concatenating together I should use the ".." operator in one long statement. But if I don't know how many there'll be, I should stick them all in a table and call table.concat at the end.
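A minimal sketch of the two approaches discussed above, in plain Lua. The names joinWithOperator and joinWithConcat are made up for illustration and the word list is arbitrary. Both produce the same string; the difference is only in how many intermediate strings get created along the way.

```lua
-- Hypothetical helpers for illustration; neither name comes from the thread.

-- Builds the result by repeated '..', creating a new string on every pass.
local function joinWithOperator(parts)
	local s = ""
	for i = 1, #parts do
		s = s .. parts[i]
	end
	return s
end

-- Collects the pieces in a table and joins them once at the end.
local function joinWithConcat(parts)
	local t = {}
	for i = 1, #parts do
		t[i] = parts[i]
	end
	return table.concat(t)
end

local words = { "a", "b", "c", "d", "e" }
local viaOperator = joinWithOperator(words)
local viaConcat = joinWithConcat(words)
print(viaOperator, viaConcat)
```

Which one wins for a given number of pieces is something to measure in your own environment rather than take from this sketch.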

#24 Exerro

  • Members
  • 801 posts

Posted 14 September 2015 - 08:02 PM

blunty666, on 14 September 2015 - 05:27 PM, said:

awsumben13, on 07 September 2015 - 04:32 PM, said:

-snip-

Just to clarify on this, if I know exactly how many strings I am concatenating together I should use the ".." operator in one long statement. But if I don't know how many there'll be, I should stick them all in a table and call table.concat at the end.

Yeah, pretty much. Putting all of them in one line will be quickest because it's one instruction, but obviously you can't do that with an unknown number of strings, so table.concat is the best way to go there.

#25 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • Location: Tasmania (AU)

Posted 15 September 2015 - 01:28 AM

Note that table.concat() starts to beat out .. concatenation once you reach a certain number of elements. I'm finding that number to be about ten to fifteen.

local minElements, maxElements, word, reps, counter = 1, 200, "\"asdf\"", 1000000, 0

print("Test: "..word..".."..word..".."..word.." vs table.concat()")

-- Binary search for the element count at which table.concat() overtakes '..'
repeat
	local elements, curElements = {}, math.floor((maxElements - minElements) / 2 + minElements)
	for i = 1, curElements do elements[i] = word end

	-- Each generated chunk times its own loop and returns the elapsed seconds
	local time1 = loadstring("local t = os.clock()    for i = 1, "..reps.." do    local s = "..table.concat(elements,"..").."    end   return os.clock() - t")()
	sleep(0) -- yield between tests
	local time2 = loadstring("local e = {"..table.concat(elements,",").."}    local t = os.clock()    for i = 1, "..reps.." do    local s = table.concat(e)    end    return os.clock() - t")()
	sleep(0)

	counter = counter + 1
	print("Set "..counter..", "..curElements.." elements: "..time1.."s (..), "..time2.."s (concat)")

	if time1 > time2 then maxElements = curElements else minElements = curElements end
	-- Stop once the bounds are adjacent; waiting for min == max can loop forever
until time1 == time2 or maxElements - minElements <= 1


#26 Wergat

  • Members
  • 18 posts

Posted 11 January 2016 - 03:10 PM

Hi,
I am currently working on a big project that requires a lot of optimizing, and I am trying to reduce unwanted latency caused by my bad coding. It would be great if you could answer a few questions I have about optimization in CC.

1) Is it recommended to cache an array's length?
Spoiler

2) Objects/Classes with "optimized" tables + get/setters or ...not?
Spoiler

3) Is there a difference when using t.abc or t["abc"]?
Spoiler

4) How fast is math.abs?
Spoiler

5) How about returning something after else
Spoiler

Edited by Wergat, 11 January 2016 - 03:10 PM.


#27 SquidDev

    Frickin' laser beams | Resident Necromancer

  • Members
  • 1,427 posts
  • Location: Does anyone put something serious here?

Posted 11 January 2016 - 04:05 PM

Putting my answers in a spoiler to prevent massive walls of text:
Spoiler

You can use os.clock() to get the current computer time, then run an operation n times and calculate the change in time. I like to do something 100000000 times, but you may need to lower this to prevent "too long without yielding" errors. You can also use Linux's time command to test from the command line.
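As a rough sketch of that timing approach: the helper name timeIt and the repetition count are assumptions chosen for illustration, not part of Lua or ComputerCraft. The function-call overhead of the wrapper is included in both measurements, so only the difference between them is meaningful.

```lua
-- Hypothetical timing helper; 'timeIt' is not part of Lua or ComputerCraft.
-- Runs fn 'reps' times and returns the elapsed os.clock() time in seconds.
local function timeIt(fn, reps)
	local start = os.clock()
	for i = 1, reps do
		fn()
	end
	return os.clock() - start
end

local reps = 100000 -- kept small here; raise it until the numbers are stable

local elapsedOperator = timeIt(function()
	local s = "a" .. "b" .. "c" .. "d" .. "e"
end, reps)

local elapsedConcat = timeIt(function()
	local s = table.concat({ "a", "b", "c", "d", "e" })
end, reps)

print(("'..': %.3fs, table.concat: %.3fs"):format(elapsedOperator, elapsedConcat))
```

Run it several times and compare the trend rather than a single pair of numbers.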

Also: Just to quote myself from another topic:

SquidDev, on 14 April 2015 - 03:09 PM, said:

Gotta agree with ElvishJerricco, you should work on making the algorithms you use as fast as possible. There is a famous quote which is pulled out every time someone asks about optimisation, to warn people off it.

Quote

Premature optimization is the root of all evil

However, the full quote is as follows:

Quote

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.
Profile your code - write stubs for CC only methods and then use LuaProfiler to check how well it runs, optimise them, and repeat. Don't optimise at the expense of readability, and don't optimise a method just because it might be slow when you haven't checked.

Edited by SquidDev, 11 January 2016 - 04:06 PM.


#28 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • Location: Tasmania (AU)

Posted 12 January 2016 - 12:24 AM

Definitely test things if in doubt - especially in this thread, people've got into the habit of digging up what other people have said online, but most people around the web are talking about C implementations of Lua - and ComputerCraft uses LuaJ, version 2.0.3.

Which is why using LuaProfiler will not give you accurate results, and nor will tests performed in many CC "emulators". You can get "hints" that way, but not accuracy. If you want best results within the CC environment, then test within the CC environment.

Wergat, on 11 January 2016 - 03:10 PM, said:

-- Using
t.amount
-- instead of
#t.elements
If I use the cached version X times, does it become faster?

Not as much as if you defined "amount" outside of the "t" table - pulling it out of there is slower than just defining it as a variable within the closest scope you can stick it in.
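A small sketch of that idea, assuming a table shaped like the one in the question (the contents of elements are arbitrary example values). The length is read once into a local in the enclosing scope and reused inside the loop body, instead of going back through t each time.

```lua
-- Example data; the field name 'elements' follows the question above.
local t = { elements = { 1, 2, 3, 4 } }

-- Read the table field and its length once, into locals.
local elements = t.elements
local amount = #elements

local sum = 0
for i = 1, amount do
	-- 'amount' is read from a local here rather than via t.amount
	local nextIndex = i % amount + 1
	sum = sum + elements[i] * elements[nextIndex]
end
print(sum)
```

The cached value has to be refreshed whenever elements are added or removed, otherwise it goes stale.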

Wergat, on 11 January 2016 - 03:10 PM, said:

2) Objects/Classes with "optimized" tables + get/setters or ...not?

As a general rule, function calls are slower than table lookups. As SquidDev points out, if things get complicated, simply take advantage of your table's ability to act as a hashmap.
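To illustrate both halves of that, here is a sketch with invented names (newCounter, getCount and isVowel are not from the thread). The first part contrasts a getter call with a direct field read; the second uses a table as a hashmap in place of a chain of comparisons.

```lua
-- Getter-based "object": every read through getCount costs a function call.
local function newCounter()
	local self = { count = 0 }
	function self.getCount()
		return self.count
	end
	return self
end

local counter = newCounter()
counter.count = counter.count + 1     -- direct field access
local viaGetter = counter.getCount()  -- function call plus field access
local viaField = counter.count        -- field access only

-- Table used as a hashmap: one lookup instead of several '==' tests.
local isVowel = { a = true, e = true, i = true, o = true, u = true }
local vowels = 0
for ch in ("optimization"):gmatch(".") do
	if isVowel[ch] then
		vowels = vowels + 1
	end
end
print(viaGetter, viaField, vowels)
```

Whether dropping the getter is worth the loss of encapsulation depends on how hot the code path is.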

Wergat, on 11 January 2016 - 03:10 PM, said:

4) How fast is math.abs?

Without testing it, I'd say your function is faster (assuming you fix it per SquidDev's comment) - because math.abs() involves a lookup into the "math" table to get the "abs" function. Even this would lead to a speed increase:

local pos = math.abs

Of course, sticking the (val>0 and val or -val) construct directly into your code where you need it would be faster again - because that eliminates the function call, too.
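The three variants side by side, as a sketch. The sample values are arbitrary, and the local is named abs here purely for illustration. All three give the same results; they differ only in how many lookups and calls each one involves.

```lua
local abs = math.abs -- cached once; later calls skip the lookup in 'math'

local values = { -3, 0, 2.5 }
local viaLibrary, viaLocal, viaInline = {}, {}, {}

for i = 1, #values do
	local val = values[i]
	viaLibrary[i] = math.abs(val)           -- global + table lookup + call
	viaLocal[i] = abs(val)                  -- call only
	viaInline[i] = val > 0 and val or -val  -- no call at all
end

for i = 1, #values do
	print(viaLibrary[i], viaLocal[i], viaInline[i])
end
```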

#29 HDeffo

  • Members
  • 214 posts

Posted 23 January 2016 - 09:50 AM

Alright, so now that I am back to ComputerCraft again I ran my benchmark tests on all my suggestions again and found something interesting... Basically every version of CC, even small updates, seems to change what saves time and what doesn't. That would also explain the many debates on some of the methods, depending on date and version. Since I come and go from the community and don't have as much time as an extensive list like this requires, I would like to enlist some help in keeping this updated as much as possible. Any volunteers? I feel a guide like this can be handy, but obviously if it's outdated it'll only cause more problems.

#30 SquidDev

    Frickin' laser beams | Resident Necromancer

  • Members
  • 1,427 posts
  • Location: Does anyone put something serious here?

Posted 23 January 2016 - 10:35 AM

HDeffo, on 23 January 2016 - 09:50 AM, said:

Alright, so now that I am back to ComputerCraft again I ran my benchmark tests on all my suggestions again and found something interesting... Basically every version of CC, even small updates, seems to change what saves time and what doesn't. That would also explain the many debates on some of the methods, depending on date and version. Since I come and go from the community and don't have as much time as an extensive list like this requires, I would like to enlist some help in keeping this updated as much as possible. Any volunteers? I feel a guide like this can be handy, but obviously if it's outdated it'll only cause more problems.

It's odd that CC versions change the speed of the Lua VM. Dan doesn't really touch the LuaJ code. The latest versions have changed the string encoding methods when converting to and from Java - though this should only change performance when calling Java methods.

Do you have a GitHub repo with your benchmarks? I'd be happy to help put some more together. There are some other benchmarking programs around which might be worth looking into - but they focus more on comparing implementations of CC rather than actual code.

#31 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • Location: Tasmania (AU)

Posted 23 January 2016 - 12:10 PM

I've also noticed speed differences between builds of ComputerCraft, but how much of them have to do with "ComputerCraft" and how much of them have to do with the ton of other mods in the packs I was using is unclear to me. I believe it to be faster in general than it was when I first started using it, back at CC version 1.5.

Probably the most important changes involve rendering. For example, CC 1.6 introduced the window API and rigged all advanced computers to render through it by default. Then CC 1.74 came along and said API was heavily optimised using the new term.blit() command - it's now actually easier to get better rendering rates using the window API than without, depending on the complexity of what you're drawing.

(Tip: Use term.current() to get a hold of the window object your multishell tab is using, set it to invisible before performing complex render operations, then set it visible to have them blitted to the screen all together in one go. Whoosh.)
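A sketch of how that tip might be wrapped up. renderHidden is a hypothetical helper, not a ComputerCraft function, and it assumes the object handed to it exposes setVisible() the way window objects do; terminal objects without that method are simply drawn to directly.

```lua
-- Hypothetical wrapper; 'renderHidden' is not a ComputerCraft function.
-- 'win' is assumed to be a window object exposing setVisible(), such as
-- the one term.current() returns per the tip above.
local function renderHidden(win, draw)
	if win.setVisible then
		win.setVisible(false) -- buffer writes instead of showing each one
	end
	draw(win)
	if win.setVisible then
		win.setVisible(true) -- show the buffered contents in one go
	end
end

-- Inside ComputerCraft only; 'term' does not exist in plain Lua.
if term and term.current then
	renderHidden(term.current(), function(win)
		for y = 1, 5 do
			win.setCursorPos(1, y)
			win.write("line " .. y)
		end
	end)
end
```

If the draw function can error, wrapping the call in pcall would avoid leaving the window invisible.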

But I suspect the event-handling backend of ComputerCraft has undergone some changes as well, and certainly, much has changed within the Lua-based APIs we use on a regular basis.

Edited by Bomb Bloke, 24 January 2016 - 12:20 AM.


#32 Lyqyd

    Lua Liquidator

  • Moderators
  • 8,465 posts

Posted 23 January 2016 - 06:29 PM

You may have meant term.current, unless the function name was changed in the most recent versions.

#33 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • Location: Tasmania (AU)

Posted 24 January 2016 - 12:16 AM

Yeah, that.

#34 HDeffo

  • Members
  • 214 posts

Posted 24 January 2016 - 06:34 PM

It would actually be good to know why versions have such a big impact on the speed of individual functions as well. As for benchmarks, I have been meaning to start a raw table of different tests, but as I said I just haven't had much free time lately and my schedule is only just now starting to free up. I currently use pepperfish for benchmarking; however, since that relies on a slight change in ComputerCraft to enable the debug API, I also test as a secondary with various timing functions such as os.clock. Both of these methods unfortunately have some overhead, and I am not technical enough to completely rebuild ComputerCraft with profiling tests that won't add any overhead while staying as true to function speeds as possible.

#35 Sewbacca

  • Members
  • 450 posts
  • Location: Star Wars

Posted 11 July 2016 - 03:20 PM

When is it more useful to save the type of a variable in a loop than to ask for it every time?
Example:
-- Saving the type
for i = 1, #tab do
  local typ = type(tab[i])
  if typ == 'table' then
  elseif typ == 'function' then
  <...>
  end
end
-- Predeclaring a var
local typ
for i = 1, #tab do
  typ = type(tab[i])
  if typ == 'table' then
  elseif typ == 'function' then
  <...>
  end
end
-- Asking the type every time
for i = 1, #tab do
  if type(tab[i]) == 'table' then
  elseif type(tab[i]) == 'function' then
  <...>
  end
end


#36 Exerro

  • Members
  • 801 posts

Posted 11 July 2016 - 03:43 PM

Predeclaring a variable does nothing performance-wise, it just changes which variables are 'visible' to sub-blocks (unless it's taken outside of a function in which case upvalues are involved, but not here).

The first example should be better every time: rather than getting a global (type), calling it for every block, then performing the comparison, you're just performing the comparison. Because of how Lua treats locals, there's no performance difference in these two:

local t = type( x )
print( t == v )

and

print( type( x ) == v )

...and that's because in the first case, the result of the call is stored in the register for 't' which is then used in the comparison, and in the latter, the result of the call is stored in a newly allocated register and then used in the comparison. Basically, they're both stored in a register, so it's all the same. However, if you were to use the result of type( x ) again, it'd be easier to just use the register for 't' rather than calling the function again.

At the same time, if you only have a few cases in that if statement, there's no significant advantage performance-wise to storing the type in a variable, so it's just down to what looks best and what's more readable really.
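A runnable version of the first variant, as a sketch. The contents of tab and the counts table are invented so there is something to observe. type() is called once per element and the result is reused by every branch.

```lua
-- Example data for illustration only.
local tab = { {}, print, "text", 42, {} }
local counts = { table = 0, ["function"] = 0, other = 0 }

for i = 1, #tab do
	local typ = type(tab[i]) -- one call per element, reused below
	if typ == "table" then
		counts.table = counts.table + 1
	elseif typ == "function" then
		counts["function"] = counts["function"] + 1
	else
		counts.other = counts.other + 1
	end
end

print(counts.table, counts["function"], counts.other)
```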

#37 Sewbacca

  • Members
  • 450 posts
  • Location: Star Wars

Posted 11 July 2016 - 04:40 PM

Exerro, on 11 July 2016 - 03:43 PM, said:

Predeclaring a variable does nothing performance-wise, it just changes which variables are 'visible' to sub-blocks (unless it's taken outside of a function in which case upvalues are involved, but not here).

The first example should be better every time: rather than getting a global (type), calling it for every block, then performing the comparison, you're just performing the comparison. Because of how Lua treats locals, there's no performance difference in these two:

local t = type( x )
print( t == v )

and

print( type( x ) == v )

...and that's because in the first case, the result of the call is stored in the register for 't' which is then used in the comparison, and in the latter, the result of the call is stored in a newly allocated register and then used in the comparison. Basically, they're both stored in a register, so it's all the same. However, if you were to use the result of type( x ) again, it'd be easier to just use the register for 't' rather than calling the function again.

At the same time, if you only have a few cases in that if statement, there's no significant advantage performance-wise to storing the type in a variable, so it's just down to what looks best and what's more readable really.

Thank You =)

#38 CrazedProgrammer

  • Members
  • 495 posts
  • Location: Wageningen, The Netherlands

Posted 11 July 2016 - 07:14 PM

Wow, never saw this topic, great job!
Although I already do around 90% of the optimizations you pointed out, this will help me make even faster code!
Thanks! :D

Edited by CrazedProgrammer, 11 July 2016 - 07:14 PM.


#39 The Crazy Phoenix

  • Members
  • 136 posts
  • Location: Probably within 2 metres of my laptop.

Posted 11 July 2016 - 10:58 PM

If I'm trying to initialize a table of dynamic or insanely high size, how can I optimize it such that LuaJ will initialize its array with the size I want? For example, a 1,048,576-sized array.

Whilst you address many interesting optimizations, for some of them there really is no way of taking advantage when the sizes are dynamic.

#40 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • Location: Tasmania (AU)

Posted 12 July 2016 - 12:31 AM

It can be done, but I can't think of any method that'd actually be faster than simply sticking the values in one index at a time.

For example, an array the size you're talking about can be built within a twentieth of a second via a simple "for" loop - you'd need to go a lot larger than that before it'd start to matter.
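The simple loop in question, as a sketch you can time yourself. The fill value 0 is arbitrary, and the elapsed time will vary with the machine and the Lua implementation, so no particular figure is implied here.

```lua
local size = 1048576 -- the array size mentioned in the question

local start = os.clock()
local t = {}
for i = 1, size do
	t[i] = 0 -- fill one index at a time
end
local elapsed = os.clock() - start

print(("built %d elements in %.3fs"):format(#t, elapsed))
```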

Edited by Bomb Bloke, 12 July 2016 - 12:32 AM.





