Aria2C Batch File For Quick Download

Copy the following CMD batch-file content into a new file,
and name it aria2download.cmd.

Place the file in the same folder as your aria2c.exe,
make sure that folder is in your system PATH,

and you can now enter aria2download "http..your-url..."
for a quick file-download to the current folder you are in.

All the important switches are already set: maximum split value and maximum connections per server for an ultra-fast download, permissive SSL so you can download from secure servers without certificate verification, and a browser-like user-agent and referrer value to create a compatible download session with servers that often block download managers. You can easily personalise the batch file to include additional headers, authentication, etc…
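
Until you grab the full batch from the post, here is a minimal sketch of what aria2download.cmd could look like (the switch values below are my own assumptions, not necessarily the ones used in the full batch):

@echo off
::minimal sketch - pass the URL as the first argument, e.g.:  aria2download "http..your-url..."
aria2c.exe --split=16 --max-connection-per-server=16 --check-certificate=false --user-agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)" --referer="%~1" --dir="%CD%" "%~1"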

Continue reading

Make Chrome Faster

1. Use a 64-bit build of Chrome/Chromium, if your machine can handle it.

2. Install both of these official Chrome extensions:
https://chrome.google.com/webstore/detail/dalkfakmooljfejnddeaibdkgbbogpea
https://chrome.google.com/webstore/detail/pcgokfnmpaefeofjpbicabmcadpcnhon

3. Access chrome://flags/ and change the following values;
you can reach each flag directly by copy&pasting its URL into the address bar.

👉 chrome://flags/#num-raster-threads ➧4
👉 chrome://flags/#default-tile-width ➧1024
👉 chrome://flags/#default-tile-height ➧1024

👉 chrome://flags/#enable-fast-unload ➧Enable
👉 chrome://flags/#smooth-scrolling ➧Disable
👉 chrome://flags/#enable-quic ➧Enable
👉 chrome://flags/#enable-zero-copy ➧Enable
👉 chrome://flags/#enable-site-per-process ➧Enable
👉 chrome://flags/#v8-cache-options ➧’Cache V8 parser data.’
👉 chrome://flags/#v8-cache-strategies-for-cache-storage
➧’Aggressive’
👉 chrome://flags/#enable-scroll-anchoring ➧Disable
👉 chrome://flags/#enable-pointer-events ➧Enable
👉 chrome://flags/#passive-listener-default ➧’Force All True’
👉 chrome://flags/#document-passive-event-listeners ➧Enable
👉 chrome://flags/#passive-event-listeners-due-to-fling ➧Enable
👉 chrome://flags/#expensive-background-timer-throttling ➧Enable
👉 chrome://flags/#enable-nostate-prefetch ➧’Enabled Prerender’
👉 chrome://flags/#enable-resource-prefetch ➧’Enable Prefetching’
👉 chrome://flags/#delay-navigation ➧Disable

4. Just before the restart required in stage [3] (above),
close all tabs other than chrome://flags/,
open chrome://net-internals/#dns in a new tab, and click 'clear host cache'. You can safely close the chrome://net-internals/#dns tab now; back in chrome://flags/, click the big blue 'relaunch now' button.

5. Run Chrome with the following command-line switches:

--enable-accelerated-vpx-decode="0x03" --prefetch-search-results --disable-pinch --disable-in-process-stack-traces --enable-tcp-fastopen --enable-threaded-compositing --enable-gpu-scheduler --use-double-buffering --enable-hardware-overlays --enable-partial-raster --disable-speech-api --ipc-connection-timeout="90"  --enable-gpu-memory-buffer-compositor-resources --enable-gpu-memory-buffer-video-frames --enable-native-gpu-memory-buffers --disable-payment-request --disable-3d-apis --disable-logging --disable-presentation-api --enable-rgba-4444-textures --v8-cache-options="data" --v8-cache-strategies-for-cache-storage="aggressive"

you may use a .bat or .cmd batch file if it is easier for you,
or use my https://github.com/eladkarako/iniRun project.
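
For example, a minimal launcher batch could look like this (the path is taken from the ini example below, and the switch selection is just a subset of stage 5, so adjust to taste):

@echo off
::minimal sketch - path and switches are examples, edit to match your own install.
set CHROME=C:\Users\Elad\AppData\Local\Chromium\Application\chrome.exe
start "" "%CHROME%" --enable-tcp-fastopen --enable-zero-copy --enable-partial-raster --disable-logging
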
here is my ini file (for chromium :])

[Information]
Parent_Folder=C:\Users\Elad\AppData\Local\Chromium\Application

Arguments=--force-device-scale-factor="1.2" --enable-accelerated-vpx-decode="0x03" --allow-outdated-plugins --ppapi-flash-path="C:\Windows\System32\Macromed\Flash\PEPFLA~1.DLL" --ppapi-flash-version="24.0.0.221" --prefetch-search-results --enable-lcd-text --enable-font-antialiasing=1 --ppapi-antialiased-text-enabled=1 --no-referrers --reduced-referrer-granularity --force-ui-direction=ltr --enable-pepper-testing --keep-alive-for-test --disable-pinch --ipc-connection-timeout="90" --disable-hang-monitor --disable-in-process-stack-traces --enable-tcp-fastopen --enable-threaded-compositing --enable-grouped-history --ash-md=enabled --material-design-ink-drop-animation-speed="fast" --show-md-login --top-chrome-md="material" --secondary-ui-md="material" --enable-gpu-scheduler --show-md-login --disable-md-oobe --use-double-buffering --desktop-window-1080p --enable-hardware-overlays --enable-partial-raster --disable-speech-api --enable-gpu-memory-buffer-compositor-resources --enable-gpu-memory-buffer-video-frames --enable-native-gpu-memory-buffers --disable-payment-request --disable-3d-apis --disable-logging --disable-presentation-api --enable-rgba-4444-textures --v8-cache-options="data" --v8-cache-strategies-for-cache-storage="aggressive" --enable-threaded-compositing --no-referrers


;;// Overrides the timeout, in seconds, that a child process waits for a
;;// connection from the browser before killing itself.
;;const char kIPCConnectionTimeout[]          = "ipc-connection-timeout";


Full_Path=C:\Users\Elad\AppData\Local\Chromium\Application\chrome.exe

Run_Mode=SW_SHOWMAXIMIZED

Search this blog for newer ways to download all recent command-line switches; this might also help: https://github.com/eladkarako/Chrome-Command-Line-Switches :]

Enjoy!

Stuff You Should Exclude From Your Anti-Virus

It should be perfectly safe to exclude some folders from your anti-virus processing.
Have a look below; my notes/comments should help you understand the general state of mind, and you can then apply the same reasoning to modify or add other items.

Continue reading

ExifTool – Ultimate Batch To Remove All Tags, From Infinite List Of Files, With Verbose Status And Progress Report

ExifTool is a free and open-source software program for reading, writing, and manipulating image, audio, video, and PDF metadata. It is platform independent, available as both a Perl library (Image::ExifTool) and command-line application. ExifTool is commonly incorporated into different types of digital workflows and supports many types of metadata including Exif, IPTC, XMP, JFIF, GeoTIFF, ICC Profile, Photoshop IRB, FlashPix, AFCP and ID3, as well as the manufacturer-specific metadata formats of many digital cameras.

http://www.sno.phy.queensu.ca/~phil/exiftool/
https://en.wikipedia.org/wiki/ExifTool
https://sourceforge.net/projects/exiftool/
https://downloads.sourceforge.net/project/exiftool/exiftool-10.48.zip?r=&ts=1491744058&use_mirror=netix

The following example is a batch file for the Windows operating system, but it can easily be migrated to any other supported OS. To use it, just mark a bunch of files (any amount! it is not limited to the usual batch-queue limit of 9) and drag&drop them over the batch file.

Plus, it will work on any file; file types that are not supported by ExifTool will simply be ignored (skipped), so feel free to just use [CTRL]+[A] to select all the files instead of pin-point-selecting just the ones you need, or wondering whether they will be supported, if it will, it will.. and if it won't, it won't.. :]

remove_metadata.cmd:

@echo off
:LOOP
::has argument ?
if ["%~1"]==[""] (
  echo done.
  goto END;
)
::argument exist ?
if not exist %~s1 (
  echo not exist
  goto NEXT;
)
::file exist ?
echo exist
if exist %~s1\NUL (
  echo is a directory
  goto NEXT;
)
::OK
echo is a file

set "FILE_INPUT=%~s1"
set "FILE_OUTPUT=%~d1%~p1%~n1_fixed%~x1"

::FILE_OUTPUT is prepared in case you prefer writing to a new "_fixed" file instead of editing in-place (add: -o "%FILE_OUTPUT%")
call exiftool -progress -verbose -ignoreMinorErrors -XMPToolkit="" -all="" -trailer:all="" "%FILE_INPUT%"


:NEXT
shift
goto LOOP

:END
pause

aria2c Sample – Chromium Command-Line Switch Updater

Getting the most up-to-date command-line switches for Chromium (the Google-Chrome base code) is always a work in progress;
since this is `live code` you can never say "OK, I'm done": there will always be a new switch, or an old one retired from being actively used in the, well.., actual code.
So if you're relying on command-line switches in your scripts, or just want to try out new features before they `bleed into` the actual Google-Chrome main version, you probably want to bookmark this article :]]

Continue reading

f6drivers.img

Hiren's BootCD's menu.lst includes the following part:

title Mini Windows Xp\nRun Antivirus and other windows programs
# example password: test 
# password --md5 $1$gNe0$KZPOE8wNbTNSKOzrajuoB0
find --set-root /HBCD/XP/XP.BIN
#map --mem /HBCD/XP/f6drivers.img (fd0)&&map --hook
chainloader /HBCD/XP/XP.BIN

The f6drivers.img line is commented out (the file does not actually exist).

You may download the x86 drivers from #1 or #2, use WinImage to place the content in a new .IMA file, and then rename the extension from .IMA to .IMG.

Download ready-to-use .IMG files:
Continue reading

Reverse Engineer – WD External Storage Driver Setup (x64/x86)

  • The WD External Storage Driver is suitable for the entire line of myPassport USB hard drives.
  • This article allows you to download both the original Windows Installer MSI file and its content (drivers only), and install them yourself.
    You may also embed the drivers to prepare a Windows setup through slipstream (just an idea).
  • The drivers are not modified in any way, and are digitally signed with WD's certificate, so you may install them with no particular issue.
    Since they are signed, there is no need to run these two commands:

    • bcdedit.exe -set loadoptions DDISABLE_INTEGRITY_CHECKS
    • bcdedit.exe -set TESTSIGNING ON

Driver’s version

  • from INF: 1.0.0009.0
  • from files: 1.0.7.2 (actual)

Driver’s created

  • INF reports: Jan 19th 2011
  • from files: February 16th 2011 (actual)

x86
Original.
wd_installer_x86
WD SES Driver Setup (x86).msi [678KB]

Drivers only.
wd_ses_driver_only_x86
WD SES Drivers Only (x86).zip [17KB]

Extract anywhere, then install via right-click → Install.
Or run C:\Windows\System32\InfDefaultInstall.exe "wdcsam.inf".


x64
Driver’s version: 1.0.0009.0, Actual version (from files) 1.0.7.2
Driver Created: Jan 19th 2011, Actual date (from files) February 16th 2011

Original.
wd_installer_x64
WD SES Driver Setup (x64).msi [1.03MB]

Drivers only.
wd_ses_driver_only_x64
WD SES Drivers Only (x64).zip [17KB]

Extract anywhere, then install via right-click → Install.
Or run C:\Windows\System32\InfDefaultInstall.exe "wdcsam.inf".
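
If you prefer the command line, here is a minimal sketch from an elevated prompt, run inside the folder where wdcsam.inf was extracted (pnputil is an extra option I'm adding, not part of the original instructions):

%SystemRoot%\System32\InfDefaultInstall.exe "%CD%\wdcsam.inf"
::or stage it into the driver store (Windows 7 and later):
pnputil.exe -i -a "%CD%\wdcsam.inf"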

Continue reading

Solved: ThinApp Registry Export

Say you want to convert VMware's ThinApp (formerly known as Thinstall) registry
from the virtual-sandbox format (those text files in your capture/project-name folder) to something a human can easily read, say, a Windows registry file.

Why? Well.. maybe you've just captured a setup process in order to check what has been changed on your operating system.

A really common reason to use ThinApp without actually building anything at the end, at least among VM-savvy engineers, is to track the changes made to the operating system, in the hope of simplifying installations, for cases where all you may need is a pair of exe-and-reg files, with no need for the overkill of sandboxing an entire application plus a virtual-app engine.

ThinApp does a very good job of capturing even the deepest registry changes, including permission-limited ones or ones which do not "really exist", such as soft symbolic-linked keys, for example under HKEY_USERS (which are common enough).

Another way of comparing registry changes involves dumping the entire registry (before and after..) and comparing the two .REG files using a program such as BeyondCompare.

A similar but slightly easier method is to use Registry Workshop and its "before" and "after" snapshots feature, followed by the built-in compare engine, which is pretty much a nice wrapper around the same thing (above), except that the program's internal compare engine also lets you jump into the inspected values, sync changes, etc…

So..

I’ve captured a nice little freeware called foxit PDF
and got the familiar folder structure (before building anything!)

icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key1

If you’ll have a look (just a look, don’t worry..)
inside the build.bat batch-file,
You’ll see part of the command we’ll going to use, which is actually part of creating the virtual-sandbox,
in-particularity- the REGISTRY part:

icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key2_batch_look

After the hint, it is time for the solution walk-through:

  1. Under your ThinApp folder (same level where you'll find the Setup Capture.exe file) create a new folder, named reg_convert.
    icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_folder1
  2. Under reg_convert create two folders, named in and out
    icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_folder2
  3. Into the in folder, copy the Package.ini from your captured package,
    or use this generic, minimal Package.ini

    You only need the part related to setup capture, mostly the versioning of the ThinApp capture engine; the code-page [language] value of 1033 [English] might be useful in case you have registry keys with foreign characters, in which case you might want to have a look at the values in one of your original captured Package.ini files, or Google it.. 1037 is Hebrew :]

    [Compression]
    CompressionType=None
    
    [Isolation]
    DirectoryIsolationMode=Merged
    
    [BuildOptions]
    ;-------- Parameters used only during Setup Capture  ----------
    AccessDeniedMsg=You are not currently authorized to run this application. Please contact your administrator.
    CapturedUsingVersion=5.1.0-2079447
    CaptureProcessorArchitecture=0
    CapturePlatformVersion=0501
    CaptureOSArchitecture=32
    CaptureOSMajorVersion=5
    CaptureOSMinorVersion=1
    CaptureOSSuite=256
    CaptureOSProductType=1
    CaptureOSCSDVersion=Service Pack 3
    CaptureOSProcessorCoreCount=2
    CaptureOSRemoteSession=0
    CaptureOSVMwareVM=1
    OutDir=bin
    
    AnsiCodePage=1255
    LocaleIdentifier=1033
    
    AltArchitectureShortcut=0
    QualityReportingEnabled=0
    

    icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_folder3

  4. Still under the in folder, you should now copy (just) the registry files (.TXT) from your captured project.

    You do not have to copy them all, and you are encouraged to make them as small and light as possible by removing values that are not needed. The smaller and fewer they are, the faster the entire process will complete.

    For example I’ll going to remove the following “keys/values/data” since they are not needed or even related to the package itself, even more than that, those can collide with the operation-system’s more recent-values (in-case I’ll be building the project later..)
    icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_can_make_registry_txt_files_smaller

  5. At this point, we will generate a virtual-sandbox holding just the registry values (no files), using the vregtool.exe command.

    1. open up CMD and navigate to where you have your vregtool.exe
      (same place you’ll have reg_convert).
      cd c:\.......\ThinApp\
    2. run vregtool.exe reg_convert\out\reg.tvr ImportDir reg_convert\in\,
      icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_generate_tvr_of_just_registry_values2
      You may ignore warnings, or remove any extra empty lines at the bottom of the txt files.
      It will take a few seconds, and you'll find the tvr file under the out folder.
      icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_generate_tvr_result
      1. Almost done; we will now extract the actual registry keys (in the standard Windows format) from the virtual sandbox, exporting them to the same out folder.

        run: vregtool.exe reg_convert\out\reg.tvr ExportReg reg_convert\out\registry.reg
        icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_generate_final_registry_key_result
        You’ll find your result file under the out folder as well as the old tvr file.
        icompile.eladkarako.com_thinapp_thinstall_virtual_registry_tvr_convert_to_reg_key_generate_final_registry_key_result_in_folder

      2. Done.

        Naturally, a cleanup will be required in order to reuse the same txt → tvr → reg conversion method:
        remove the txt files under the in folder, but keep the Package.ini file there to be used next time.
        You can safely remove the entire content of the out folder (the reg.tvr, and, once you are done with it, the registry.reg file).

        Naturally, a batch file can quite easily be put together, for example something along the lines of the sketch below.
        You can make one that accepts a drag&dropped captured folder, automating the copy, the generation, and the copying of the result back into your captured project, in the same place as the txt files, to keep things organised by project. :)
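
        A minimal sketch of such a batch (run from the ThinApp folder, matching the layout above; drag&drop handling and the cleanup are left out):

        @echo off
        ::sketch only - expects the registry .TXT files (and Package.ini) to already be inside reg_convert\in\
        vregtool.exe reg_convert\out\reg.tvr ImportDir reg_convert\in\
        vregtool.exe reg_convert\out\reg.tvr ExportReg reg_convert\out\registry.reg
        echo result: reg_convert\out\registry.reg
        pause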

      Hope it helps ! :]

      Happy engineering day :]]

Chromium API Keys

When you build Chromium yourself, or download a Chromium nightly build, you need to provide an API key, a client secret and a client ID;
running Chromium without any of those three values will seriously compromise Chromium's functionality.

There is a sort of "inner-Google" page on how to generate and use the API keys needed to make Chromium a proper, fully functional Chrome-compatible browser (including syncing your bookmarks and other stuff..).

There is an issue, however, with that page (https://www.chromium.org/developers/how-tos/api-keys): it has not been updated for a while, especially regarding the names of the API services one needs to enable in order to use Chromium in a fully functional mode.

icompile.eladkarako.com_google_console_api_services_list_helper_for_chromium_api_keys

I’ve gathered those services (including a direct link for you to use in-order to activate it)
the following list (Google API-console) is ordered as:

– Name.
– Description.
– URL.

the list of API-services is sorted (A,B,C..) by the Name of the API-service.
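
Once you have the three values, one way to hand them to a Chromium build is through environment variables, as described on the chromium.org page above (a sketch; the values are placeholders):

@echo off
::sketch - replace the placeholders with your own values from the Google API console,
::and adjust the path to your own Chromium location.
set GOOGLE_API_KEY=your_api_key
set GOOGLE_DEFAULT_CLIENT_ID=your_client_id
set GOOGLE_DEFAULT_CLIENT_SECRET=your_client_secret
start "" "C:\Users\Elad\AppData\Local\Chromium\Application\chrome.exe"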

Continue reading

CCleaner INI Rip

CCleaner has internal resources, such as EXE, INI, string tables for languages, PNG and bitmaps.
Here is the INI section, which might help you understand what is behind each cleanup section;
the format is the same as the custom INI you can add, or the ccleaner.ini that stores your settings (in the portable version it sits in the same folder as ccleaner.exe).
Continue reading

Microsoft URLS Used By Background Services You Can-Not Block With HOSTS-File

Blocking or routing URL addresses using the HOSTS file is quite an easy practice to master.
Here are a few of Microsoft's URL addresses, used by background services,
that do not "pass through" the standard HOSTS-file DNS resolution, which means that
trying to "block" (127.0.0.* with no internal server running, or 0.0.0.0) or redirect them will do little to nothing (see the sample entries right after the list).

  1. www.msdn.com
  2. msdn.com
  3. www.msn.com
  4. msn.com
  5. go.microsoft.com
  6. msdn.microsoft.com
  7. office.microsoft.com
  8. microsoftupdate.microsoft.com
  9. wustats.microsoft.com
  10. support.microsoft.com
  11. www.microsoft.com
  12. microsoft.com
  13. update.microsoft.com
  14. download.microsoft.com
  15. microsoftupdate.com
  16. windowsupdate.com
  17. windowsupdate.microsoft.com
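
For reference, this is the standard HOSTS syntax that those background services will simply bypass (a sketch; appending to the HOSTS file requires an elevated prompt):

::run from an elevated CMD - appends standard "block" entries, which the services above will ignore anyway.
echo 0.0.0.0 www.msdn.com>>%SystemRoot%\System32\drivers\etc\hosts
echo 0.0.0.0 update.microsoft.com>>%SystemRoot%\System32\drivers\etc\hosts
echo 0.0.0.0 windowsupdate.com>>%SystemRoot%\System32\drivers\etc\hosts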

You can find the dnsapi.dll under this path: C:\Windows\System32\dnsapi.dll

icompile.eladkarako.com_microsoft_urls_used_by_background_services_you_can_not_block_with_hosts_file

You can still block those using this router trick: iCompile – Easy Router Ad-Block,

since the lookup still goes through the DNS engine; it just ignores any redirects from the HOSTS file.

JKDefrag

JKDefrag is a disk defragmenter and optimizer for Windows 2000/2003/XP/Vista/2008, compatible with x86/x64 platform architectures.
It is completely automatic and very easy to use, fast, low overhead, with several optimization strategies, 
and can handle floppies, USB disks, memory sticks, and anything else that looks like a disk to Windows.

Included are a Windows version, a commandline version (for scheduling by the task scheduler or for use from administrator scripts), a screensaver version, a DLL library (for use from programming languages), versions for Windows X64, and the complete sources.
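
For example, a scheduled weekly run of the command-line version could look like this (a sketch; the extraction path is a placeholder, the JkDefragCmd.exe name is my assumption based on the ZIP content, and passing a drive letter limits the run to that drive):

::sketch - creates a weekly task that defragments drive C: using the command-line version.
schtasks /create /tn "JkDefrag weekly" /sc WEEKLY /d SUN /st 03:00 /tr "C:\Tools\JkDefrag\JkDefragCmd.exe c:"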

Why use this defragger instead of the standard Windows defragger?
 - Much faster.
 - Totally automatic, extremely easy to use.
 - Optimized for daily use.
 - Disk optimization, several strategies.
 - Directories are moved to the beginning of the disk.
 - Reclaims MFT reserved space after disk-full.
 - Maintains free spaces for temporary files.
 - Can defragment very full harddisks.
 - Can defragment very large files.
 - Can defragment individual directories and files.
 - Can be run automatically with the Windows Scheduler.
 - Can be used from the commandline.
 - Can be used as a screen saver.
 - Can be run from cdrom or memory stick.
 - Sources available, can be customized.
 - Supports x86/x64 architecture.

JKDefrag is open-source software by Jeroen Kessels;
this is the "3.36" version, since from version 4 onwards
it was renamed "MyDefrag", which is closed-source freeware.

Title: JkDefrag 3.36
Filename: JkDefrag-3.36.zip
File size: 467KB (478,618 bytes)
Requirements: Windows 2000 / XP / Vista / Windows 7 / Windows 8 / Windows 10 / Windows 10 64-bit
Languages: Multiple languages
License: Freeware
Date added: September 1, 2008
Author: J C Kessels

JkDefrag-3.36.zip


The source can be found here: https://github.com/eladkarako/JKDefrag-Original.

CMD Ninja – Relative To Fully Qualified Path And Other File Properties Without Directing Relative Path To Another CMD-File As An Argument

Given the following directory-tree
icompile.eladkarako.com_cmd_file_path_properties_short_relative_file_name_directory_folder_without_sending_to_another_cmd_file_as_argument_args

I want to get some information on apktool_2.0.2.jar,
for example the fully qualified short path (old DOS 8.3 format, which is compatible with old Java argument handling),

here is a snippet containing it.
The '_' prefix is advisable; the double '%' escapes the loop's variable; the '~' strips any ' or " wrapping characters around the path (best practice, always use it); the reason the '_drive' expansion (also) has 's' in it is to make the result's letter case more consistent (the 8.3 format uses upper case, while if you run the CMD console and browse or change directory using a lower-case drive letter, that lower case would otherwise be embodied in the result).

@echo off
set relative=.\apktool_2.0.2.jar

for /f %%a in ("%relative%")do (set "_full=%%~fa"     )
for /f %%a in ("%relative%")do (set "_full83=%%~fsa"  )
for /f %%a in ("%relative%")do (set "_drive=%%~dsa"   )
for /f %%a in ("%relative%")do (set "_path=%%~pa"     )
for /f %%a in ("%relative%")do (set "_path83=%%~psa"  )
for /f %%a in ("%relative%")do (set "_name=%%~na"     )
for /f %%a in ("%relative%")do (set "_name83=%%~nsa"  )
for /f %%a in ("%relative%")do (set "_ext=%%~xa"      )
for /f %%a in ("%relative%")do (set "_ext83=%%~xsa"   )
for /f %%a in ("%relative%")do (set "_att=%%~aa"      )
for /f %%a in ("%relative%")do (set "_time=%%~ta"     )
for /f %%a in ("%relative%")do (set "_size=%%~za"     )

echo %_full%
echo %_full83%
echo %_drive%
echo %_path%
echo %_path83%
echo %_name%
echo %_name83%
echo %_ext%
echo %_ext83%
echo %_att%
echo %_time%
echo %_size%

::------------------------------------------------------------::
::  relative  < ->  .\apktool_2.0.2.jar                        ::
::------------------------------------------------------------::
::  _full     < ->  D:\DOS\android\bin\apktool_2.0.2.jar       ::
::  _full83   < ->  D:\DOS\android\bin\APKTOO~1.JAR            ::
::  _drive    < ->  D:                                         ::
::  _path     < ->  \DOS\android\bin\                          ::
::  _path83   < ->  \DOS\android\bin\                          ::
::  _name     < ->  apktool_2.0.2                              ::
::  _name83   < ->  APKTOO~1                                   ::
::  _ext      < ->  .jar                                       ::
::  _ext83    < ->  .JAR                                       ::
::  _att      < ->  --a------                                  ::
::  _time     < ->  10/14/2015 02:06 PM                        ::
::  _size     < ->  6329931                                    ::
::------------------------------------------------------------::

CMD Ninja – Unlimited Arguments Processing, Identifying If Exist In File-System, Identifying If File Or Directory

@echo off

:loop
      ::-------------------------- has argument ?
      if ["%~1"]==[""] (
        echo done.
        goto end
      )
      ::-------------------------- argument exist ?
      if not exist %~s1 (
        echo not exist
      ) else (
        echo exist
        if exist %~s1\NUL (
          echo is a directory
        ) else (
          echo is a file
        )
      )
      ::--------------------------
      shift
      goto loop
      
      
:end

pause

save it as identifier.cmd
it can identify an unlimited number of arguments (normally you are limited to %1..%9); just remember to wrap the arguments in quotation marks, or use 8.3 naming, or drag&drop them over (which automatically does either of the above).

this allows you to run the following commands:
identifier.cmd c:\windows
and to get

exist
is a directory
done

identifier.cmd "c:\Program Files (x86)\Microsoft Office\OFFICE11\WINWORD.EXE"
and to get

exist
is a file
done

and multiple arguments (of course this is the whole-deal..)
identifier.cmd c:\windows\system32 c:\hiberfil.sys "c:\pagefile.sys" hello-world
and to get

exist
is a directory
exist
is a file
exist
is a file
not exist
done.

naturally it can be a lot more complex,
but nice examples should be simple and minimal.

also posted at StackOverflow:

Fast download and parse http-archive requests file

Download a full-length file from archive.org, trimming it (with any method you like) to an about 300KB gzip file, which is about a 2.5MB txt file, which is about 2570 requests (line separated); you can open it with a Notepad2/EditPlus/Notepad++ editor that can handle those sorts of file sizes,
and remove the last line, which is usually partial due to the trimming (..the step before).

Parsing is quite easy; perl/python and even bash have samples for extracting tsv/csv data.
You can download an easy-to-use (fully working) example here (it will probably also work for the untrimmed file): http_archive_parse.rar


This includes automated download and parsing. Use the cmd file for Windows (you would need to modify it a bit for Linux, but it can be done easily). The sh file is a "line parser", by field, of the CSV. note #1: the last line in requests.txt is removed since it will be corrupted (the gzip file is compressed line-based, so we just remove the last line to keep it valid even though the file is truncated).

note #2: see http://httparchive.org/downloads.php for more details.

note #3: the CSV does not include a header, so I've added one, taking it from the MySQL schema available here:
http://httparchive.org/downloads/httparchive_schema.sql
(db-table name: “requests”)

----

how to download

::@echo off

set sURL=http://www.archive.org/download/httparchive_downloads_Nov_15_2013/httparchive_Nov_15_2013_requests.csv.gz

::----------------------------------------------------------------------------------------------
set exePath=.\resource
set csvHeader=requestid,pageid,startedDateTime,time,method,url,urlShort,redirectUrl,firstReq,firstHtml,reqHttpVersion,reqHeadersSize,reqBodySize,reqCookieLen,reqOtherHeaders,status,respHttpVersion,respHeadersSize,respBodySize,respSize,respCookieLen,expAge,mimeType,respOtherHeaders,req_accept,req_accept_charset,req_accept_encoding,req_accept_language,req_connection,req_host,req_if_modified_since,req_if_none_match,req_referer,req_user_agent,resp_accept_ranges,resp_age,resp_cache_control,resp_connection,resp_content_encoding,resp_content_language,resp_content_length,resp_content_location,resp_content_type,resp_date,resp_etag,resp_expires,resp_keep_alive,resp_last_modified,resp_location,resp_pragma,resp_server,resp_transfer_encoding,resp_vary,resp_via,resp_x_powered_by
::----------------------------------------------------------------------------------------------

::cleanup
del /f /q requests.gz
del /f /q requests.txt
del /f /q requests_fixed.txt


::download 300kb ~~ 2.5mb txt file ~~ 2570 requests (line separated)
%exePath%\curl.exe --location-trusted --output requests.gz --range 0-307200 "%sURL%"

::un-gz
%exePath%\7z.exe x requests.gz >nul

::cleanup
del /f /q requests.gz

::fix - remove last line since it will be corrupted 
%exePath%\head.exe -n -1 requests.txt > requests_fixed.txt

::cleanup
del /f /q requests.txt

echo %csvHeader% >requests.txt
%exePath%\cat.exe requests_fixed.txt >>requests.txt

::cleanup
del /f /q requests_fixed.txt

how to parse

#!/bin/bash
INPUT=requests.txt
OLDIFS=$IFS
IFS=,
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
while read method url redirectUrl
do
        echo "method : $method"
        echo "url : $url"
        echo "redirectUrl : $redirectUrl"
done < $INPUT
IFS=$OLDIFS

y2013m12d10 update:
download schema from http://httparchive.org/downloads/httparchive_schema.sql
extract requests table creation script:

CREATE TABLE `requests` (
  `requestid` INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
  `pageid` INT(10) UNSIGNED NOT NULL,
  `startedDateTime` INT(10) UNSIGNED DEFAULT NULL,
  `time` INT(10) UNSIGNED DEFAULT NULL,
  `method` VARCHAR(32) DEFAULT NULL,
  `url` TEXT,
  `urlShort` VARCHAR(255) DEFAULT NULL,
  `redirectUrl` TEXT,
  `firstReq` TINYINT(1) NOT NULL,
  `firstHtml` TINYINT(1) NOT NULL,
  `reqHttpVersion` VARCHAR(32) DEFAULT NULL,
  `reqHeadersSize` INT(10) UNSIGNED DEFAULT NULL,
  `reqBodySize` INT(10) UNSIGNED DEFAULT NULL,
  `reqCookieLen` INT(10) UNSIGNED NOT NULL,
  `reqOtherHeaders` TEXT,
  `status` INT(10) UNSIGNED DEFAULT NULL,
  `respHttpVersion` VARCHAR(32) DEFAULT NULL,
  `respHeadersSize` INT(10) UNSIGNED DEFAULT NULL,
  `respBodySize` INT(10) UNSIGNED DEFAULT NULL,
  `respSize` INT(10) UNSIGNED DEFAULT NULL,
  `respCookieLen` INT(10) UNSIGNED NOT NULL,
  `expAge` INT(10) UNSIGNED NOT NULL,
  `mimeType` VARCHAR(255) DEFAULT NULL,
  `respOtherHeaders` TEXT,
  `req_accept` VARCHAR(255) DEFAULT NULL,
  `req_accept_charset` VARCHAR(255) DEFAULT NULL,
  `req_accept_encoding` VARCHAR(255) DEFAULT NULL,
  `req_accept_language` VARCHAR(255) DEFAULT NULL,
  `req_connection` VARCHAR(255) DEFAULT NULL,
  `req_host` VARCHAR(255) DEFAULT NULL,
  `req_if_modified_since` VARCHAR(255) DEFAULT NULL,
  `req_if_none_match` VARCHAR(255) DEFAULT NULL,
  `req_referer` VARCHAR(255) DEFAULT NULL,
  `req_user_agent` VARCHAR(255) DEFAULT NULL,
  `resp_accept_ranges` VARCHAR(255) DEFAULT NULL,
  `resp_age` VARCHAR(255) DEFAULT NULL,
  `resp_cache_control` VARCHAR(255) DEFAULT NULL,
  `resp_connection` VARCHAR(255) DEFAULT NULL,
  `resp_content_encoding` VARCHAR(255) DEFAULT NULL,
  `resp_content_language` VARCHAR(255) DEFAULT NULL,
  `resp_content_length` VARCHAR(255) DEFAULT NULL,
  `resp_content_location` VARCHAR(255) DEFAULT NULL,
  `resp_content_type` VARCHAR(255) DEFAULT NULL,
  `resp_date` VARCHAR(255) DEFAULT NULL,
  `resp_etag` VARCHAR(255) DEFAULT NULL,
  `resp_expires` VARCHAR(255) DEFAULT NULL,
  `resp_keep_alive` VARCHAR(255) DEFAULT NULL,
  `resp_last_modified` VARCHAR(255) DEFAULT NULL,
  `resp_location` VARCHAR(255) DEFAULT NULL,
  `resp_pragma` VARCHAR(255) DEFAULT NULL,
  `resp_server` VARCHAR(255) DEFAULT NULL,
  `resp_transfer_encoding` VARCHAR(255) DEFAULT NULL,
  `resp_vary` VARCHAR(255) DEFAULT NULL,
  `resp_via` VARCHAR(255) DEFAULT NULL,
  `resp_x_powered_by` VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (`requestid`),
  UNIQUE KEY `startedDateTime` (`startedDateTime`,`pageid`,`urlShort`),
  KEY `pageid` (`pageid`)
) ENGINE=MYISAM AUTO_INCREMENT=1070623634 DEFAULT CHARSET=latin1;

Run the query to create the table and then export it to CSV (the table is empty..) to get all the headers in an easy copy&paste format.

requestid,pageid,startedDateTime,time,method,url,urlShort,redirectUrl,firstReq,firstHtml,reqHttpVersion,reqHeadersSize,reqBodySize,reqCookieLen,reqOtherHeaders,status,respHttpVersion,respHeadersSize,respBodySize,respSize,respCookieLen,expAge,mimeType,respOtherHeaders,req_accept,req_accept_charset,req_accept_encoding,req_accept_language,req_connection,req_host,req_if_modified_since,req_if_none_match,req_referer,req_user_agent,resp_accept_ranges,resp_age,resp_cache_control,resp_connection,resp_content_encoding,resp_content_language,resp_content_length,resp_content_location,resp_content_type,resp_date,resp_etag,resp_expires,resp_keep_alive,resp_last_modified,resp_location,resp_pragma,resp_server,resp_transfer_encoding,resp_vary,resp_via,resp_x_powered_by

download to desktop and extract the entire archive of requests for December
http://www.archive.org/download/httparchive_downloads_Dec_1_2013/httparchive_Dec_1_2013_requests.csv.gz
rename requests.txt to httparchive_Dec_1_2013_requests.csv

(a download of 2.23GB, at a rate of about 1.5MB/s, will take about 33 minutes).

Use cygwin (or a win32-compiled fgrep, but cygwin is better here..)
and run the following command to extract only the lines (this file is line-based) of the hosts we are interested in.
**Note the "DESKTOP" location:

LC_ALL=C fgrep 'tuoitre.vn' "c:/users/karako/Desktop/httparchive_Dec_1_2013_requests.csv" >c:/users/karako/Desktop/httparchive_Dec_1_2013_requests__tuoitrevn.csv

Now you'll have a smaller CSV; note that neither the original csv file nor this one will have headers in it.
You might want to copy and paste the header into the first line (for any reason)..

This is the resulting CSV, containing only the host 'tuoitre.vn', with headers:
httparchive_Dec_1_2013_requests__tuoitrevn.csv

Now you can run the shell script to map it to a cURL request
(this assumes the CSV file is on your desktop):

(parse.sh content:)

#!/bin/bash
INPUT=c:/users/karako/Desktop/httparchive_Dec_1_2013_requests.csv
OLDIFS=$IFS
IFS=,
[ ! -f $INPUT ] && { echo "$INPUT file not found"; exit 99; }
# note: only the first three variables get values; redirectUrl swallows the rest of each line,
# and reqOtherHeaders/req_user_agent below are never assigned, so they print empty (see the output sample further down).
while read method url redirectUrl
do
    echo "method : $method"
    echo "url : $url"
    echo "redirectUrl : $redirectUrl"
    echo "reqOtherHeaders : $reqOtherHeaders"
    echo "req_user_agent : $req_user_agent"
    echo "---------------------------------------------------"
done < $INPUT
IFS=$OLDIFS

*don't forget..

chmod u+x parse.sh
./parse.sh

*** update
output example:

---------------------------------------------------
method : "1045176502"
url : "12623315"
redirectUrl : "1385861436","191","GET","https://yolacom.yolacdn.net/assets/img/slider/07_blueprintforjonas_tn.jpg","https://yolacom.yolacdn.net/assets/img/slider/07_blueprintforjonas_tn.jpg",N,"0","0","1.1",N,N,"0","X-Download-Initiator = image="doc F300 win 1458; image; src; enter tree"","200","1.1",N,"23961","23961","0","1","image/jpeg","Yola-ID = web1 D=199 t=1385159099295583","image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5",N,"gzip, deflate","en-US","Keep-Alive","yolacom.yolacdn.net",N,N,"https://www.yola.com/","Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; PTST 2.361)","bytes",N,"max-age=1","keep-alive",N,N,"23961",N,"image/jpeg","Sun, 01 Dec 2013 09:30:37 GMT",""6a7d71-5d99-4ebc7b1028f80"","Sun, 01 Dec 2013 09:30:38 GMT",N,"Fri, 22 Nov 2013 17:53:27 GMT",N,N,"Apache",N,N,N,N
reqOtherHeaders :
req_user_agent :
---------------------------------------------------
method : "1045176504"
url : "12623315"
redirectUrl : "1385861436","338","GET","https://yolacom.yolacdn.net/assets/img/slider/08_bodybar_tn.jpg","https://yolacom.yolacdn.net/assets/img/slider/08_bodybar_tn.jpg",N,"0","0","1.1",N,N,"0","X-Download-Initiator = image="doc F300 win 1458; image; src; enter tree"","200","1.1",N,"22489","22489","0","1","image/jpeg","Yola-ID = web2 D=271 t=1385685922213243","image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5",N,"gzip, deflate","en-US","Keep-Alive","yolacom.yolacdn.net",N,N,"https://www.yola.com/","Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; PTST 2.361)","bytes",N,"max-age=1","keep-alive",N,N,"22489",N,"image/jpeg","Sun, 01 Dec 2013 09:30:37 GMT",""ae7ce9-57d9-4ec37cb0da71e"","Sun, 01 Dec 2013 09:30:38 GMT",N,"Thu, 28 Nov 2013 07:38:00 GMT",N,N,"Apache",N,N,N,N
reqOtherHeaders :
req_user_agent :
---------------------------------------------------
method : "1045176506"
url : "12623315"
redirectUrl : "1385861436","336","GET","https://yolacom.yolacdn.net/assets/img/slider/09_channysphotography_tn.jpg","https://yolacom.yolacdn.net/assets/img/slider/09_channysphotography_tn.jpg",N,"0","0","1.1",N,N,"0","X-Download-Initiator = image="doc F300 win 1458; image; src; enter tree"","200","1.1",N,"30751","30751","0","1","image/jpeg","Yola-ID = web1 D=297 t=1384009469773430","image/png, image/svg+xml, image/*;q=0.8, */*;q=0.5",N,"gzip, deflate","en-US","Keep-Alive","yolacom.yolacdn.net",N,N,"https://www.yola.com/","Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; PTST 2.361)","bytes",N,"max-age=1","keep-alive",N,N,"30751",N,"image/jpeg","Sun, 01 Dec 2013 09:30:37 GMT",""54ef65-781f-4eab2bf29513e"","Sun, 01 Dec 2013 09:30:38 GMT",N,"Fri, 08 Nov 2013 23:28:59 GMT",N,N,"Apache",N,N,N,N
reqOtherHeaders :
req_user_agent :
---------------------------------------------------

As you can see, there seems to be a slight shift (>>>>?) between the schema and the actual CSV export data,
but that's not too bad; feel free to comment on the change needed..

y2013m12d11 update:
--------------------

place all of the gzip files in a folder ("res")

cd ./res/
zcat *.gz | LC_ALL=C fgrep --fixed-strings 'tuoitre.vn' >requests_with_tuoitrevn.csv

this one gave me an idea!
http://drjohnstechtalk.com/blog/2011/06/gnu-parallel-really-helps-with-zcat/

cd ./res/
ls *gz | parallel -k "zcat {} | LC_ALL=C fgrep --fixed-strings 'tuoitre.vn'" >requests_with_tuoitrevn.csv

time it!!
see how much cpu is used
on a dummy (for testing) content
zcat *.gz | time LC_ALL=C fgrep --fixed-strings 'tuoitre.vn'>requests_with_tuoitrevn.csv
--->0.00user 0.00system 0:00.01elapsed 0%CPU (0avgtext+0avgdata 194560maxresident)k 0inputs+0outputs (782major+0minor)pagefaults 0swaps

ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep --fixed-strings 'tuoitre.vn'">requests_with_tuoitrevn.csv
---->0.20user 0.13system 0:00.41elapsed 80%CPU (0avgtext+0avgdata 7208960maxresident)k 0inputs+0outputs (28944major+0minor)pagefaults 0swaps

hmmm... maybe just using months first (multi cygwin shell for each month):
zcat httparchive_Dec_1_2013_requests.csv.gz | grep '.tase.co.il'>2013_Dec__requests_with_tasecoil.csv

update y2013m12d11:

----------------------------------------------------------------------------------------------------------------------------------
0.1 download csv.gz archive for example:
http://ia601504.us.archive.org/14/items/httparchive_downloads/httparchive_Apr_1_2013_requests.csv.gz

0.2 rename it from:

httparchive_Apr_1_2013_requests.csv.gz

to:

y2013_m04_d01.csv.gz

0.3 do it for as much archives as possible.

0.4 create c:/work/trunk/massRequest
0.5 create c:/work/trunk/massRequest/res
0.6 create c:/work/trunk/massRequest/1

0.7 place all csv.gz files in c:/work/trunk/massRequest/1

--------------------------

1. place just 1 file in the c:/work/trunk/massRequest/res
for example:

y2012_m12_d01.csv.gz

2. open five cygwin shell windows, since we are going to run a separate process for each host.
the hosts we are looking for:

vmware.com
cars.com
.rbs.com
tuoitre.vn
tase.co.il

3. on each cygwin window run:

cd c:/work/trunk/massRequest/res

4. run one command on each window:

ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep 'vmware.com'" >requests_with_vmwarecom.csv
ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep 'cars.com'" >requests_with_carscom.csv
ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep '.rbs.com'" >requests_with_rbscom.csv
ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep 'tuoitre.vn'" >requests_with_tuoitrevn.csv
ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep 'tase.co.il'" >requests_with_tasecoil.csv

5. when done, add a prefix to each result csv file:
for example:
you run this command:

        ls *gz | time parallel -k "zcat {} | LC_ALL=C fgrep 'vmware.com'" >requests_with_vmwarecom.csv

the file in the folder was:
y2013_m01_d01.csv.gz

--->rename

            requests_with_vmwarecom.csv

to

            y2013_m01_d01__requests_with_vmwarecom.csv

6. move all files to ./done/ folder, marking all this month done.

y2013_m01_d01.csv.gz
y2013_m01_d01__requests_with_carscom.csv
y2013_m01_d01__requests_with_rbscom.csv
y2013_m01_d01__requests_with_tasecoil.csv
y2013_m01_d01__requests_with_tuoitrevn.csv
y2013_m01_d01__requests_with_vmwarecom.csv

7. move a new gz file from folder ./1/ into the res folder.
8. on each cygwin window, press the {UP} key and Enter to run the command again.
9. keep downloading new csv.gz files on the 1st and 15th of each month.

-------------------------------------------------------------------------------------------
since this process is massive on the CPU
and the download of all the archives takes a long time,
we only filter out lines that do not have our host name in them;

a secondary filter (on the smaller output files) is needed,
where we should check that the *request* (the url of the site) is the one in that line.
-------------------------------------------------------------------------------------------

-----------------------------------------------------------------------------------------------------------------------------

Parallel cUrl Requests

Multi-processing the cURL executable (on Windows) is a very easy and fast way to roll a small-scale load/traffic test against a resource.
Although there are probably better ways (a multi-curl lib in Python, curl_multi in PHP), this is one nice all-in-one solution.

As a bonus, there is some XFF (X-Forwarded-For) header input from a text file.

The sample project is attached as a rar; it's open, free to use (under GPL).

-pre:
-- get a compiled curl (with or without libssl32.dll, 32-bit or 64-bit, it really does not matter)

--- request.cmd
-----------------------------
@echo off

set sUserAgent="Mozilla/5.0 (Windows NT 5.2; WOW64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.89 Safari/537.1"
set sHOST=%1
set sIP=%2

::.\curl.exe --header "X-Forwarded-For: %sIP%" --user-agent %sUserAgent% --referer %sHOST% --verbose %sHOST%
.\curl.exe --header "X-Forwarded-For: %sIP%" --user-agent %sUserAgent% --referer %sHOST% --silent %sHOST%

exit

--- run20.cmd
----------------------------
@echo off
echo I am the runner

set HOST=__host.txt
set IP=__country_Afghanistan_20.txt

::/b runs each request without opening a new window, /low sets low priority
for /f "tokens=* delims= " %%a in (%HOST%) do (
for /f "tokens=* delims= " %%b in (%IP%) do (
start "" /b /low cmd /c "call request.cmd %%a %%b"
)
)
exit

--- __country_Afghanistan_20.txt
210.80.29.156
175.106.44.147
210.80.20.153
111.125.156.144
121.127.53.184
203.215.38.139
180.94.76.33
202.56.178.46
180.94.95.100
58.147.130.55
58.147.144.53
121.127.38.249
210.80.27.69
125.213.222.141
210.80.7.11
125.213.218.47
58.147.153.70
180.222.137.96
180.94.65.4
175.106.40.162

--- __host.txt
http://172.29.42.113/blank2.html

