



Verifying that two files are the same


49 replies to this topic

#1 oeed

    Oversimplifier

  • Members
  • 2,095 posts
  • LocationAuckland, New Zealand

Posted 03 May 2014 - 08:16 AM

So, I've been getting a ton of crash reports from OneOS and a huge amount (over 80%) are from people mucking around with the code. Now, I don't have a problem with this really, it's just that they're meant to set a variable at the top of startup to true to prevent these reports. But they aren't doing that.

To try and conquer this, I've decided that the best way would be to detect whether any of the system files have been altered and, if so, not send the error report. I'm aware that there are lots of programs and AAP topics that hash strings and files, but none really specifically about this. However, as I've barely ever used or looked into checksums or that sort of thing, I'm not really sure what the best way of doing this is. Storage space is also one of my main concerns: OneOS already uses about 80% of the default storage space and these system files are about 100KB, so I can't have huge hashes of them. There are about 30 of them too. Speed is also an issue; I don't want it freezing for 10 seconds while it compares all the files.

So yeah, what would the best way to do this be? I'm thinking of storing the hashes, or whatever I use, in a table keyed by file name. Checking whether the file sizes are different is another possibility, but it may not be accurate enough.
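A minimal sketch of the table idea, assuming a manifest that maps each system file path to its expected checksum. The names `verifyFiles`, `readFile` and `checksum`, and the paths and values in the example table, are hypothetical and only there for illustration; the file reader and checksum function are passed in so the logic does not depend on any particular hashing API.

```lua
-- Hypothetical manifest: path -> expected checksum (made-up paths and values).
local expectedChecksums = {
  ["startup"] = "1a2b3c4d",
  ["System/main"] = "5e6f7a8b",
}

-- Returns true if every listed file matches its expected checksum.
-- Otherwise returns false plus the first path that is missing or differs.
-- readFile(path) must return the file contents as a string, or nil if missing.
-- checksum(contents) must return a string comparable to the manifest values.
local function verifyFiles(manifest, readFile, checksum)
  for path, expected in pairs(manifest) do
    local contents = readFile(path)
    if contents == nil or checksum(contents) ~= expected then
      return false, path
    end
  end
  return true
end
```

In ComputerCraft, `readFile` would presumably wrap `fs.open(path, "r")`, `readAll()` and `close()`. With around 30 files and short checksums, a manifest like this should only cost a few hundred bytes.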

Edited by oeed, 03 May 2014 - 08:19 AM.


#2 logsys

  • Members
  • 171 posts

Posted 03 May 2014 - 08:22 AM

It would be faster to upload the file to a PHP script, calculate the MD5 hash there, and have the script return true or false. That would save some time.

#3 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • LocationTasmania (AU)

Posted 03 May 2014 - 08:32 AM

It sorta strikes me that an easier route would be to have error reporting off by default - if an error occurs that otherwise would be reported, you'd then throw up a message telling the user to consider enabling it. This'd give you an opportunity to force your "don't bug me with error reports if you're modifying my code" message somewhere it'll be seen before a user starts sending you "false positives".

#4 logsys

  • Members
  • 171 posts

Posted 03 May 2014 - 09:04 AM

I have an idea: put a blank line at the end of the file. If the code is modified, that line's number will change.

#5 awsmazinggenius

  • Members
  • 930 posts
  • LocationCanada

Posted 04 May 2014 - 02:47 AM

Except you can put multiple Lua statements/whatever in one line, so if someone really wanted to be a pain, or they just can't read or they know nothing about conventions, they could take that route.

[dumb moment]I would take the SHA1 hash of the computer's content and compare it to the hash of the latest commit on GitHub, since the readme gets downloaded and everything. You would upload the computer, hash it server-side (just in case someone modifies the hash-checking code) and compare it.[/dumb moment]
EDIT: I'm an idiot, that way, you couldn't save files on the computer. You would only upload the default files.

Edited by awsmazinggenius, 04 May 2014 - 02:52 AM.


#6 oeed

    Oversimplifier

  • Members
  • 2,095 posts
  • LocationAuckland, New Zealand

Posted 04 May 2014 - 02:56 AM

awsmazinggenius, on 04 May 2014 - 02:47 AM, said:

Except you can put multiple Lua statements/whatever in one line, so if someone really wanted to be a pain, or they just can't read or they know nothing about conventions, they could take that route.

[dumb moment]I would take the SHA1 hash of the computer's content and compare it to the hash of the latest commit on GitHub, since the readme gets downloaded and everything. You would upload the computer, hash it server-side (just in case someone modifies the hash-checking code) and compare it.[/dumb moment]
EDIT: I'm an idiot, that way, you couldn't save files on the computer. You would only upload the default files.
If someone's modifying the hashing code then they're just being malicious, that's another issue. The people causing this just aren't reading.
Would it be possible to create a single hash from a few files? I don't want a hash of everything, people have their own documents and settings files. I don't want to hash it server side either, uploading about a megabyte of files isn't practical.

#7 awsmazinggenius

  • Members
  • 930 posts
  • LocationCanada

Posted 04 May 2014 - 03:03 AM

It should be, Git makes hashes from multiple files, so this should just be a matter of picking and choosing which files to hash.

#8 Bomb Bloke

    Hobbyist Coder

  • Moderators
  • 7,099 posts
  • LocationTasmania (AU)

Posted 04 May 2014 - 03:15 AM

If you choose to take the hashing route, you'll indeed be better off implementing the hashing code into the local copy of the script rather than sending the data to be hashed off to the internet somewhere. Either way you've got to read the complete set of data off the drive and it really doesn't take that long. My system can easily hash a hundred megs of data within a second or three.

This may sound obvious, but I'm getting the impression you're not familiar with file hashes and checksums, so I'll point out that the hashes themselves need not be more than a few dozen bytes at most (their size is typically not related to the size of the files being hashed). Whether you have one or multiple hashes for your entire set of files is completely up to you.
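To illustrate the point about size, here is a small sketch using Adler-32, a simple non-cryptographic checksum that needs only addition and modulo. It is an assumption picked for brevity, not something the thread settles on, and it would only catch casual edits rather than deliberate tampering. The result is always 8 hex characters, whatever the input length.

```lua
-- Adler-32 checksum of a string, returned as 8 lowercase hex characters.
-- Uses only addition and modulo, so it needs no bitwise library.
local function adler32(data)
  local a, b = 1, 0
  for i = 1, #data do
    a = (a + data:byte(i)) % 65521
    b = (b + a) % 65521
  end
  -- b forms the high 16 bits and a the low 16 bits of the 32-bit result.
  return string.format("%08x", b * 65536 + a)
end
```

A 100KB file and a one-line file both come out as 8 characters, so storing one checksum per file stays cheap.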

#9 oeed

    Oversimplifier

  • Members
  • 2,095 posts
  • LocationAuckland, New Zealand

Posted 04 May 2014 - 03:18 AM

Bomb Bloke, on 04 May 2014 - 03:15 AM, said:

This may sound obvious, but I'm getting the impression you're not familiar with file hashes and checksums, so I'll point out that the hashes themselves need not be more than a few dozen bytes at most (their size is typically not related to the size of the files being hashed). Whether you have one or multiple hashes for your entire set of files is completely up to you.

Ah ok, that makes a lot more sense actually. If I were to use GravityScore's API, for example, would the best way to combine all the files into one be to put all the files into one string and hash that, or is there a better way?
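One way the "all files in one string" approach could look, as a sketch only. The function name `combineFiles` and the `path:length:contents` layout are assumptions made for illustration. Sorting the paths keeps the result the same regardless of table iteration order, and the length prefix stops content shifted from one file to the next from producing the same combined string.

```lua
-- Combines a table of path -> contents into one string, ready to be hashed.
-- Each entry is written as "path:length:contents" in sorted path order.
local function combineFiles(files)
  local paths = {}
  for path in pairs(files) do
    paths[#paths + 1] = path
  end
  table.sort(paths)

  local parts = {}
  for _, path in ipairs(paths) do
    local contents = files[path]
    parts[#parts + 1] = path .. ":" .. #contents .. ":" .. contents
  end
  return table.concat(parts)
end
```

The combined string would then be passed to whichever hash function is used. The downside is that every file has to be held in memory at once.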

#10 awsmazinggenius

  • Members
  • 930 posts
  • LocationCanada

Posted 04 May 2014 - 03:38 AM

You should hash with SHA1, in the same way that Git hashes multiple files, so you can easily compare to the hash of the latest commit on GitHub. It will also allow you to not report the error if it is for an outdated version of OneOS.

#11 theoriginalbit

    Semi-Professional ComputerCrafter

  • Moderators
  • 7,332 posts
  • LocationAustralia

Posted 04 May 2014 - 03:40 AM

awsmazinggenius, on 04 May 2014 - 03:38 AM, said:

It will also allow you to not report the error if it is for an outdated version of OneOS.
It would also be handy for an update checker.

Edited by theoriginalbit, 04 May 2014 - 03:40 AM.


#12 oeed

    Oversimplifier

  • Members
  • 2,095 posts
  • LocationAuckland, New Zealand

Posted 04 May 2014 - 03:52 AM

awsmazinggenius, on 04 May 2014 - 03:38 AM, said:

You should hash with SHA1, in the same way that Git hashes multiple files, so you can easily compare to the hash of the latest commit on GitHub. It will also allow you to not report the error if it is for an outdated version of OneOS.
Ah I see, so I wouldn't even have to make a list of hashes. So I take it you'd just read the file content into a string, hash that, then compare it.

#13 HometownPotato

  • Members
  • 62 posts

Posted 04 May 2014 - 03:53 AM

Couldn't you just do something like this:

local x = fs.open("file1", "r") -- fs.open needs a mode; "r" opens for reading
local c = x.readAll()
x.close()

x = fs.open("file2", "r")
local c2 = x.readAll()
x.close()

if c == c2 then
  -- the two files have identical contents
end

It would check if the contents are the same

#14 oeed

    Oversimplifier

  • Members
  • 2,095 posts
  • LocationAuckland, New Zealand

Posted 04 May 2014 - 04:13 AM

HometownPotato, on 04 May 2014 - 03:53 AM, said:

Couldn't you just do something like this:

local x = fs.open("file1", "r") -- fs.open needs a mode; "r" opens for reading
local c = x.readAll()
x.close()

x = fs.open("file2", "r")
local c2 = x.readAll()
x.close()

if c == c2 then
  -- the two files have identical contents
end

It would check if the contents are the same
The context behind what I'm trying to do is important. I don't have the original copy of the file nor can I download it really due to both file size issues and the number of files.

#15 logsys

  • Members
  • 171 posts

Posted 04 May 2014 - 10:40 AM

PHP has an MD5 hash function. Put all the content into one string and send it to PHP. There, the MD5 has to match the expected MD5 value; after that, return true or false.

#16 viluon

  • Members
  • 183 posts
  • LocationCzech Republic

Posted 04 May 2014 - 11:05 AM

Hmm, wouldn't hashing all the files one by one and then hashing the hashes be better?

You could then compare the two hashes - local and github
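A rough sketch of the hash-of-hashes idea, assuming the caller supplies a file reader and a hash function. The names `hashOfHashes`, `readFile` and `hashFn` are hypothetical. Only one file needs to be in memory at a time, and the per-file hashes could also be kept to report which file changed.

```lua
-- Hashes each file on its own, then hashes the list of "path hash" lines.
-- paths: array of file paths; readFile(path) returns the contents as a string;
-- hashFn(string) returns a string digest.
local function hashOfHashes(paths, readFile, hashFn)
  -- Sort a copy so the result does not depend on the order paths were given in.
  local sorted = {}
  for i, path in ipairs(paths) do
    sorted[i] = path
  end
  table.sort(sorted)

  local lines = {}
  for _, path in ipairs(sorted) do
    lines[#lines + 1] = path .. " " .. hashFn(readFile(path))
  end
  return hashFn(table.concat(lines, "\n"))
end
```

The single resulting digest is what would be compared against a stored or downloaded value.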

Edited by viluon, 04 May 2014 - 11:06 AM.


#17 awsmazinggenius

  • Members
  • 930 posts
  • LocationCanada

Posted 04 May 2014 - 01:51 PM

I'm not sure exactly how Git goes about SHA1ing multiple files - whether it just concatenates them, or hashes them all and then hashes the hashes, or if it concatenates them but puts some character in between the files to separate them, etc. - but you would want to do it how Git does it, which might take a little research. I'll see if I have time to look into it.

EDIT: The point is not to have to hash the local files and the GitHub files, it is to be able to compare to the already available SHA1 hash of the latest commit on GitHub.
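For reference, Git does not hash files by plain concatenation. Each file is stored as a "blob" object whose ID is the SHA-1 of a short header followed by the file's bytes, and a commit ID is the SHA-1 of the commit object itself (tree ID, parent, author, message), so it cannot be recomputed from the local files alone. Individual blob IDs can be compared, though. A sketch, where `sha1` is a hypothetical function supplied by the caller (string in, hex digest out), and assuming the bytes read locally are exactly the bytes Git stored:

```lua
-- Builds the byte string that Git feeds to SHA-1 for a file ("blob") object:
-- the word "blob", a space, the content length in bytes, a NUL byte, then the content.
local function gitBlobPreimage(contents)
  return "blob " .. #contents .. "\0" .. contents
end

-- sha1 is a hypothetical hashing function passed in by the caller.
local function gitBlobId(contents, sha1)
  return sha1(gitBlobPreimage(contents))
end
```

The result of `gitBlobId` for a local file could then be checked against the blob ID that the repository lists for the same path.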

Edited by awsmazinggenius, 04 May 2014 - 01:53 PM.


#18 Saldor010

  • Members
  • 467 posts
  • LocationThe United States

Posted 04 May 2014 - 01:54 PM

I think I have an idea. How about, whenever someone opens up a core file of OneOS, it will pop up with a message saying "Are you about to modify OneOS's code? If so, I suggest you turn off "Auto Report"."

#19 MKlegoman357

  • Members
  • 1,170 posts
  • LocationKaunas, Lithuania

Posted 04 May 2014 - 01:57 PM

Jiloacom, on 04 May 2014 - 01:54 PM, said:

I think I have an idea. How about, whenever someone opens up a core file of OneOS, it will pop up with a message saying "Are you about to modify OneOS's code? If so, I suggest you turn off "Auto Report"."

And what if I edit those files outside of OneOS, outside of CC?

#20 Saldor010

  • Members
  • 467 posts
  • LocationThe United States

Posted 04 May 2014 - 02:10 PM

MKlegoman357, on 04 May 2014 - 01:57 PM, said:

Jiloacom, on 04 May 2014 - 01:54 PM, said:

I think I have an idea. How about, whenever someone opens up a core file of OneOS, it will pop up with a message saying "Are you about to modify OneOS's code? If so, I suggest you turn off "Auto Report"."

And what if I edit those files outside of OneOS, outside of CC?

Then oh well. It's not OneOS's job to track all of that.




