{"id":1248,"date":"2011-10-28T12:03:49","date_gmt":"2011-10-28T16:03:49","guid":{"rendered":"http:\/\/cd34.com\/blog\/?p=1248"},"modified":"2011-10-28T13:18:30","modified_gmt":"2011-10-28T17:18:30","slug":"quick-python-search-and-replace-script","status":"publish","type":"post","link":"https:\/\/cd34.com\/blog\/programming\/python\/quick-python-search-and-replace-script\/","title":{"rendered":"Quick Python search and replace script"},"content":{"rendered":"<p>We have a fairly loaded client machine with a ton of modified files. Normally we would just restore from the last backup or the previous-generation backup, but over 120k files have been exploited since June 2011. Since the machine is doing quite a bit of work, we need to throttle our replacements so that we don&#8217;t kill the server.<\/p>\n<pre>\r\n#!\/usr\/bin\/python\r\n\"\"\"\r\n\r\nQuick search and replace to remove an exploit from a client's site while\r\ntrying to keep the load disruption on the machine to a minimum.\r\n\r\nSet the variable exploit to the code to be removed. By default, \r\nthis script starts at the current directory. 
While the one-minute load\r\naverage is above max_load, it sleeps 30 seconds at a time until the load drops.\r\n\r\n\"\"\"\r\n\r\nimport os\r\nimport re\r\nimport time\r\n\r\npath = '.'\r\nmax_load = 10\r\n\r\nexploit = \"\"\"\r\n&lt;script>var i,y,x=\"3cblahblahblah3e\";y='';for(i=0;i&lt;x.length;i+=2){y+=unescape('%'+x.substr(i,2));}document.write(y);&lt;\/script>\r\n\"\"\".strip()\r\n\r\n# Skip file types we do not want to scan.\r\nfile_exclude = re.compile(r'\\.(gif|jpe?g|swf|css|js|flv|wmv|mp3|mp4|pdf|ico|png|zip)$',\r\n                          re.IGNORECASE)\r\n\r\ndef check_load():\r\n    # Block while the one-minute load average exceeds max_load.\r\n    load_avg = int(os.getloadavg()[0])\r\n    while load_avg > max_load:\r\n        time.sleep(30)\r\n        load_avg = int(os.getloadavg()[0])\r\n\r\ndef getdir(path):\r\n    # Check the load once per directory, then recurse.\r\n    check_load()\r\n    for name in os.listdir(path):\r\n        file_path = os.path.join(path, name)\r\n        if os.path.isdir(file_path):\r\n            getdir(file_path)\r\n        elif not file_exclude.search(file_path):\r\n            process_file(file_path)\r\n\r\ndef process_file(file_path):\r\n    with open(file_path, 'r+') as fh:\r\n        contents = fh.read()\r\n        if exploit in contents:\r\n            print('fixing: %s' % file_path)\r\n            # Rewrite the file in place without the exploit.\r\n            fh.seek(0)\r\n            fh.truncate()\r\n            fh.write(contents.replace(exploit, ''))\r\n\r\ngetdir(path)\r\n<\/pre>\n<p>Thankfully, since this server runs as <a href=\"\/web-security\/setuid-versus-www-data\/\">www-data rather than SetUID<\/a>, the damage wasn&#8217;t as bad as it could have been.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We have a fairly loaded client machine with a ton of modified files. 
Normally we would just restore from the last backup or the previous-generation backup, but over 120k files have been exploited since June 2011. Since the machine is doing quite a bit of work, we need to throttle our replacements [&hellip;]<\/p>\n","protected":false},"author":15,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[139,138],"class_list":["post-1248","post","type-post","status-publish","format-standard","hentry","category-python","tag-exploit","tag-javascript"],"_links":{"self":[{"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/posts\/1248","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/users\/15"}],"replies":[{"embeddable":true,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/comments?post=1248"}],"version-history":[{"count":7,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/posts\/1248\/revisions"}],"predecessor-version":[{"id":1255,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/posts\/1248\/revisions\/1255"}],"wp:attachment":[{"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/media?parent=1248"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/categories?post=1248"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cd34.com\/blog\/wp-json\/wp\/v2\/tags?post=1248"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}