|
1 |
| -# GetJS |
2 |
| -[](https://opensource.org/licenses/MIT) |
3 |
| -[](https://github.com/003random/getJS/issues) |
| 1 | +<h2 align="center">JavaScript Extraction CLI & Package</h2> |
| 2 | +<p align="center"> |
| 3 | + <a href="https://pkg.go.dev/github.com/003random/getJS"> |
| 4 | + <img src="https://pkg.go.dev/badge/github.com/003random/getJS"> |
| 5 | + </a> |
| 6 | + <a href="https://github.com/003random/getJS/releases"> |
| 7 | + <img src="https://img.shields.io/github/release/003random/getJS.svg"> |
| 8 | + </a> |
| 9 | + <a href="https://github.com/003random/getJS/blob/master/LICENSE"> |
| 10 | + <img src="https://img.shields.io/badge/license-MIT-blue.svg"> |
| 11 | + </a> |
| 12 | +</p> |
4 | 13 |
|
5 |
| -getJS is a tool to extract all the javascript files from a set of given urls. |
6 | 14 |
|
7 |
| -The urls can also be piped to getJS, or you can specify a singel url with the -url argument. getJS offers a range of options, |
| 15 | +[getJS](https://github.com/003random/getJS) extracts JavaScript sources from web pages. It offers a command-line interface (CLI) for processing URLs directly and a Go package for custom integrations. |
8 | 16 |
|
9 |
| -varying from completing the urls, to resolving the files. |
| 17 | +## Table of Contents |
10 | 18 |
|
11 |
| -## Prerequisites |
| 19 | +- [Installation](#installation) |
| 20 | +- [CLI Usage](#cli-usage) |
| 21 | + - [Options](#options) |
| 22 | + - [Examples](#examples) |
| 23 | +- [Package Usage](#package-usage) |
| 24 | + - [Importing the Extractor](#importing-the-extractor) |
| 25 | + - [Example](#example) |
| 26 | +- [Version Information](#version-information) |
| 27 | +- [Contributing](#contributing) |
| 28 | +- [License](#license) |
12 | 29 |
|
13 |
| -Make sure you have [GO](https://golang.org/) installed on your system. |
| 30 | +## Installation |
14 | 31 |
|
15 |
| -### Installing |
| 32 | +To install `getJS`, use the following command: |
16 | 33 |
|
17 |
| -getJS is written in GO. You can install it with `go get`: |
| 34 | +`go install github.com/003random/getJS@latest` |
18 | 35 |
|
19 |
| -``` |
20 |
| -go install github.com/003random/getJS@latest |
21 |
| -``` |
| 36 | +## CLI Usage |
22 | 37 |
|
23 |
| -# Usage |
24 |
| -Note: When you supply urls from different sources, e.g. with stdin and an input file, it will add all the urls together :) |
25 |
| -Example: `echo "https://github.com" | getJS --url https://example.com --input domains.txt` |
26 |
| - |
27 |
| -To get all options, do: |
28 |
| -```bash |
29 |
| -getJS -h |
30 |
| -``` |
31 |
| - |
32 |
| - |
33 |
| -| Flag | Description | Example | |
34 |
| -|------|-------------|---------| |
35 |
| -| --url | The url to get the javascript sources from | getJS --url https://poc-server.com | |
36 |
| -| --method | The request method. e.g. POST or GET. Default: "GET"| getJS --url https://poc-server.com --method POST | |
37 |
| -| --timeout | The request timeout. Default: 10 (secs) | getJS --url https://poc-server.com --timeout 15 | |
38 |
| -| --insecure | Skip SSL certificate verification. Use when the cert is expired or invalid | getJS --url https://poc-server.com --insecure | |
39 |
| -| --header | Custom request header(s) | getJS --url https://poc-server.com --header "Authorization: Bearer token" | |
40 |
| -| --input | Input file with urls | getJS --input domains.txt | |
41 |
| -| --output | The file where to save the output to | getJS --output output.txt | |
42 |
| -| --verbose | Display info of what is going on | getJS --verbose | |
43 |
| -| --complete | Complete the urls. e.g. /js/index.js -> htt<span></span>ps://example.<span></span>com/js/index.js | getJS --complete | |
44 |
| -| --resolve | Resolve the output and filter out the non existing files (Can only be used in combination with --complete) | getJS --complete --resolve | |
45 |
| -| --nocolors | Don't color the output | getJS --nocolors | |
46 |
| - |
47 |
| -## Examples |
48 |
| - |
49 |
| -  |
50 |
| - |
51 |
| - |
52 |
| -getJS supports stdin data. To pipe urls to getJS, use the following: |
53 |
| - |
54 |
| -```bash |
55 |
| -$ cat domains.txt | getJS |
56 |
| -``` |
57 |
| - |
58 |
| -To save the js files, you can use: |
59 |
| -```bash |
60 |
| -$ getJS --complete --url https://poc-server.com | xargs wget |
| 38 | +### Options |
| 39 | + |
| 40 | +`getJS` provides several command-line options to customize its behavior: |
| 41 | + |
| 42 | +- `-url string`: The URL from which JavaScript sources should be extracted. |
| 43 | +- `-input string`: Optional input file of URLs, one URL per line in plain text. Can be used multiple times. |
| 44 | +- `-output string`: Optional output file to write results to. Can be used multiple times. |
| 45 | +- `-complete`: Complete/Autofill relative URLs by adding the current origin. |
| 46 | +- `-resolve`: Resolve the JavaScript files. Can only be used in combination with `-complete`. |
| 47 | +- `-threads int`: The number of processing threads to spawn (default: 2). |
| 48 | +- `-verbose`: Print verbose runtime information and errors. |
| 49 | +- `-method string`: The request method used to fetch remote contents (default: "GET"). |
| 50 | +- `-header string`: Optional request headers to add to the requests. Can be used multiple times. |
| 51 | +- `-timeout duration`: The request timeout while fetching remote contents (default: 5s). |
| 52 | + |
| 53 | +### Examples |
| 54 | + |
| 55 | +#### Extracting JavaScript from a Single URL |
| 56 | + |
| 57 | +`getJS -url https://destroy.ai` |
| 58 | + |
| 59 | +or |
| 60 | + |
| 61 | +`curl https://destroy.ai | getJS` |
| 62 | + |
| 63 | +#### Using Custom Request Options |
| 64 | + |
| 65 | +`getJS -url "http://example.com" -header "User-Agent: foo bar" -method POST -timeout 15s` |
| 66 | + |
| 67 | +#### Processing Multiple URLs from a File |
| 68 | + |
| 69 | +`getJS -input foo.txt -input bar.txt` |
| 70 | + |
| 71 | +#### Saving Results to an Output File |
| 72 | + |
| 73 | +`getJS -url "http://example.com" -output results.txt` |
| 74 | + |
| 75 | +## Package Usage |
| 76 | + |
| 77 | +### Importing the Extractor |
| 78 | + |
| 79 | +To use `getJS` as a package, import the `extractor` package and call its functions directly. |
| 80 | + |
| 81 | +### Example |
| 82 | + |
| 83 | +```go |
| 84 | +package main |
| 85 | + |
| 86 | +import ( |
| 87 | +	"fmt" |
| 88 | +	"log" |
| 89 | +	"net/http" |
| 90 | +	"net/url" |
| 91 | + |
| 92 | +	"github.com/003random/getJS/extractor" |
| 93 | +) |
| 94 | + |
| 95 | +func main() { |
| 96 | +	baseURL, err := url.Parse("https://google.com") |
| 97 | +	if err != nil { |
| 98 | +		log.Fatalf("Error parsing base URL: %v", err) |
| 99 | +	} |
| 100 | + |
| 101 | +	resp, err := extractor.FetchResponse(baseURL.String(), "GET", http.Header{}) |
| 102 | +	if err != nil { |
| 103 | +		log.Fatalf("Error fetching response: %v", err) |
| 104 | +	} |
| 105 | +	defer resp.Body.Close() |
| 106 | + |
| 107 | +	// Custom extraction points (optional). |
| 108 | +	extractionPoints := map[string][]string{ |
| 109 | +		"script": {"src", "data-src"}, |
| 110 | +		"a":      {"href"}, |
| 111 | +	} |
| 112 | + |
| 113 | +	sources, err := extractor.ExtractSources(resp.Body, extractionPoints) |
| 114 | +	if err != nil { |
| 115 | +		log.Fatalf("Error extracting sources: %v", err) |
| 116 | +	} |
| 117 | + |
| 118 | +	// Filtering and extending extracted sources. |
| 119 | +	filtered, err := extractor.Filter(sources, extractor.WithComplete(baseURL), extractor.WithResolve()) |
| 120 | +	if err != nil { |
| 121 | +		log.Fatalf("Error filtering sources: %v", err) |
| 122 | +	} |
| 123 | + |
| 124 | +	for source := range filtered { |
| 125 | +		fmt.Println(source.String()) |
| 126 | +	} |
| 127 | +} |
61 | 128 | ```
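The example above passes an empty `http.Header{}` to `extractor.FetchResponse`. As a rough sketch of how `"Name: value"` strings, like those given to the CLI's `-header` option, could be turned into an `http.Header` first, the following uses only the standard library. The helper `parseHeaders` is hypothetical and not part of getJS, and `strings.Cut` assumes Go 1.18 or newer.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// parseHeaders is a hypothetical helper (not part of getJS). It converts
// "Name: value" strings into an http.Header and rejects malformed entries.
func parseHeaders(raw []string) (http.Header, error) {
	headers := http.Header{}
	for _, line := range raw {
		// Split on the first colon only, so values may contain colons.
		name, value, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(name) == "" {
			return nil, fmt.Errorf("malformed header %q, want \"Name: value\"", line)
		}
		// Add keeps repeated headers instead of overwriting them.
		headers.Add(strings.TrimSpace(name), strings.TrimSpace(value))
	}
	return headers, nil
}

func main() {
	headers, err := parseHeaders([]string{"User-Agent: foo bar", "Accept: text/html"})
	if err != nil {
		panic(err)
	}
	fmt.Println(headers.Get("User-Agent"))
}
```

The resulting value could then be passed as the third argument of `extractor.FetchResponse` in place of `http.Header{}`.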
|
62 |
| - |
63 |
| -If you would like the output to be in JSON format, you can combine it with [@Tomnomnom's](https://github.com/tomnomnom) [toJSON](https://github.com/tomnomnom/hacks/tree/master/tojson): |
64 |
| -```bash |
65 |
| -$ getJS --url https://poc-server.com | tojson |
66 |
| -``` |
67 |
| - |
68 |
| -To feed urls from a file use: |
69 |
| -```bash |
70 |
| -$ getJS --input domains.txt |
71 |
| -``` |
72 |
| - |
73 |
| -To save the results to a file, and don't display anything, use: |
74 |
| -```bash |
75 |
| -$ getJS --url https://poc-server.com --output results.txt |
76 |
| -``` |
77 |
| - |
78 |
| -If you want to have a list of full urls as output use: |
79 |
| -```bash |
80 |
| -$ getJS --url domains.txt -complete |
81 |
| -``` |
82 |
| - |
83 |
| -If you want to only show the existing js files, use: |
84 |
| -```bash |
85 |
| -$ getJS --url domains.txt --complete --resolve |
86 |
| -``` |
87 |
| - |
88 |
| -## Built With |
89 |
| - |
90 |
| -* [GO](http://golang.org/) - GOlanguage |
91 |
| -* [Goquery](https://github.com/PuerkitoBio/goquery) - HTML parser with syntaxes like jquery, in GO |
92 | 129 |
|
| 130 | +## Version Information |
| 131 | + |
| 132 | +This is v2 of `getJS`. The original version is available under the [v1](https://github.com/003random/getJS/tree/v1) tag. |
93 | 133 |
|
94 | 134 | ## Contributing
|
95 | 135 |
|
96 |
| -You are free to submit any issues and/or pull requests :) |
| 136 | +Contributions are welcome! Please open an issue or submit a pull request for any bugs, feature requests, or improvements. |
97 | 137 |
|
98 | 138 | ## License
|
99 | 139 |
|
100 |
| -This project is licensed under the MIT License. |
101 |
| - |
102 |
| -## Acknowledgments |
103 |
| - |
104 |
| -* [@jimen0](https://github.com/jimen0) for helping getting me started with GO |
105 |
| - |
106 |
| - |
107 |
| ---- |
108 |
| - |
109 |
| -*This is my first tool written in GO. I created it to learn the language more. (useful feeback is always welcome!)* |
| 140 | +This project is licensed under the MIT License. See the [LICENSE](https://github.com/003random/getJS/blob/master/LICENSE) file for details. |