Tutorial/5 Sep, 2023

Protecting Exposed APIs: Avoid Data Leaks with SlashID Gate and OPA

Adequately protecting APIs is key to avoiding data leaks and breaches.

Just recently, an exposed API allowed an attacker to scrape over 2.6 million records from Duolingo.

In this article, we’ll show how you can use Gate to detect, respond to, and prevent these kinds of incidents.


How did the Duolingo leak happen?

In March, Ivano Somaini wrote a tweet disclosing an unauthenticated Duolingo API as part of his Open Source Intelligence (OSINT) work.

The issue is pretty straightforward. A simple API call to the https://www.duolingo.com/2017-06-30/users?email endpoint reveals several private details about users and allows attackers to enumerate registered emails. Below is an example output:

{
  "users": [
    {
      "joinedClassroomIds": [],
      "streak": 0,
      "motivation": "none",
      "acquisitionSurveyReason": "none",
      "shouldForceConnectPhoneNumber": false,
      "picture": "//simg-ssl.duolingo.com/avatar/default_2",
      "learningLanguage": "ru",
      "hasFacebookId": false,
      "shakeToReportEnabled": null,
      "liveOpsFeatures": [
        {
          "startTimestamp": 1693007940,
          "type": "TIMED_PRACTICE",
          "endTimestamp": 1693180740
        }
      ],
      "canUseModerationTools": false,
      "id": 184078602543312,
      "betaStatus": "INELIGIBLE",
      "hasGoogleId": false,
      "privacySettings": [],
      "fromLanguage": "en",
      "hasRecentActivity15": false,
      "_achievements": [],
      "observedClassroomIds": [],
      "username": "example",
      "bio": "",
      "profileCountry": "US",
      "chinaUserModerationRecords": [],
      "globalAmbassadorStatus": {},
      "currentCourseId": "DUOLINGO_RU_EN",
      "hasPhoneNumber": false,
      "creationDate": 146229322008,
      "achievements": [],
      "hasPlus": false,
      "name": "o",
      "roles": ["users"],
      "classroomLeaderboardsEnabled": false,
      "emailVerified": false,
      "courses": [
        {
          "preload": false,
          "placementTestAvailable": false,
          "authorId": "duolingo",
          "title": "Russian",
          "learningLanguage": "ru",
          "xp": 370,
          "healthEnabled": true,
          "fromLanguage": "en",
          "crowns": 7,
          "id": "DUOLINGO_RU_EN"
        }
      ],
      "totalXp": 370,
      "streakData": {
        "currentStreak": null
      }
    }
  ]
}

Armed with this API, an attacker published a dump of 2.6 million user records on VX-Underground.

This kind of incident is far from isolated, and Duolingo is just one of the many examples. In a similar incident in 2021, the “Add Friend” API allowed linking phone numbers to user accounts, costing Facebook over $275 million in fines from the Irish Data Protection Commission.

Introducing Gate

At SlashID, we believe that security begins with Identity. Gate is our identity-aware edge authorizer to protect APIs and workloads.

Gate can be used to monitor or enforce authentication, authorization and identity-based rate limiting on APIs and workloads, as well as to detect, anonymize, or block personally identifiable information (PII) exposed through your APIs or workloads.

Read on to learn how to deploy Gate to prevent data breaches like the ones mentioned above.

Deploying Gate

Gate can be deployed in multiple ways: as a sidecar for your service, as an external authorizer for Envoy, as an ingress proxy, or as a plugin for your favorite API Gateway. See more in the Gate configuration docs.

For this toy example we chose a simple Docker Compose deployment, which looks like this:

version: '3.7'

services:
  backend:
    build: backend
    ports:
      - 8000:8000
    environment:
      - PORT=8000
    env_file:
      - envs/env.env
    restart: on-failure

  gate:
    image: slashid/gate:latest
    volumes:
      - ./gate.yaml:/gate/gate.yaml
    ports:
      - 8080:8080
    env_file:
      - envs/env.env
    command: --yaml /gate/gate.yaml
    restart: on-failure
    depends_on:
      - backend

The Docker Compose file spawns two services: Gate and a toy backend.

Simulating the leaky API

Our toy backend contains a REST API that behaves similarly to the Duolingo one:

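from typing import Optional

from fastapi import FastAPI, HTTPException, Query

app = FastAPI()

# In-memory stand-in for a user database.
users = [
    {'email': 'test@example.com', 'name': 'Test User', 'id': 1},
    {'email': 'john@example.com', 'name': 'John Doe', 'id': 2},
    # ... add more users if needed
]

def get_user_by_email(email: str) -> Optional[dict]:
    """Return the user record matching the email, or None if there is none."""
    for user in users:
        if user['email'] == email:
            return user
    return None

# No authentication is required here, which is exactly the flaw we are simulating.
@app.get("/get_user/", tags=["business"])
async def read_user(email: str = Query(..., description="The email of the user to search for")):
    user = get_user_by_email(email)
    if user:
        return user
    else:
        raise HTTPException(status_code=404, detail="User not found")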

Let’s test it:

curl 'http://gate:8080/get_user/?email=test@example.com' | jq
{
  "email": "test@example.com",
  "name": "Test User",
  "id": 1
}

Detecting PII data through Gate

Gate has a plugin-based architecture and we expose several built-in plugins. In particular, the PII Anonymizer plugin allows the detection and anonymization of PII or other sensitive data.

The PII Anonymizer plugin can be configured to exclusively monitor PII (as opposed to editing the traffic) by setting the anonymizers rule to keep. We’ll show an example in the next section.

Let’s see a simple Gate configuration that detects email addresses and rewrites the HTTP response to anonymize the field with a hash of the email address:

gate:
  port: 8080
  log:
    format: text
    level: info

  plugins_http_cache:
    - pattern: '*'
      cache_control_override: private, max-age=600, stale-while-revalidate=300

  plugins:
    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      intercept: request_response
      parameters:
        anonymizers: |
          EMAIL_ADDRESS:
            type: hash

  urls:
    - pattern: '*/get_user'
      target: http://backend:8000
      plugins:
        pii_anonymizer:
          enabled: true

Let’s test it:

curl 'http://gate:8080/api/get_user/?email=test@example.com' | jq
{
  "email": "973dfe463ec85785f5f95af5ba3906eedb2d931c24e69824a89ea65dba4e813b",
  "id": 1,
  "name": "Test User"
}
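To make the rewriting step concrete, here is a minimal Python sketch of what hashing an email field in a JSON body could look like. It is not Gate's implementation: the regular expression, the `hash_emails` helper, and the choice of SHA-256 are all assumptions made for illustration (the 64-character hex value in the output above is consistent with a SHA-256 digest, but the plugin's actual algorithm is not stated here).

```python
import hashlib
import json
import re

# Deliberately simple email pattern, for illustration only; real PII
# detection is considerably more involved than a single regex.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def hash_emails(value):
    """Recursively replace email addresses in a JSON-like value with a SHA-256 hex digest."""
    if isinstance(value, dict):
        return {key: hash_emails(item) for key, item in value.items()}
    if isinstance(value, list):
        return [hash_emails(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub(
            lambda match: hashlib.sha256(match.group(0).encode("utf-8")).hexdigest(),
            value,
        )
    # Numbers, booleans and null pass through untouched.
    return value


if __name__ == "__main__":
    body = {"email": "test@example.com", "name": "Test User", "id": 1}
    print(json.dumps(hash_emails(body), indent=2))
```

Hashing keeps the field usable as a stable join key for analytics while removing the clear-text address from the response.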

Detecting PII and blocking the request with OPA

Note: similarly to the PII detection plugin, the OPA plugin can also be run in monitoring mode. See the end of the blogpost to find out more.

Sometimes hashing the PII is not enough and you want to block the request entirely. Let’s see how to combine the PII detection plugin with the OPA plugin to detect and block requests whose responses contain PII data.

Note: In the examples below we embed the OPA policies directly in the Gate config, but they can also be served through a bundle. Please check out our documentation to learn more about the plugin.

gate:
  port: 8080
  log:
    format: text
    level: info

  plugins_http_cache:
    - pattern: '*'
      cache_control_override: private, max-age=600, stale-while-revalidate=300

  plugins:
    - id: authz_deny_pii
      type: opa
      enabled: false
      intercept: response
      parameters:
        <<: *slashid_config
        policy_decision_path: /authz/allow
        policy: |
          package authz

          import future.keywords.if

          default allow := false

          no_key_found(obj, key) {
            not obj[key]
          }

          allow if no_key_found(input.response.http.headers,  "X-Gate-Anonymize-1")

    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      intercept: request_response
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
  urls:
    - pattern: '*/get_user'
      target: http://backend:8000
      plugins:
        pii_anonymizer:
          enabled: true
        authz_deny_pii:
          enabled: true

The authz_deny_pii instance of the OPA plugin enforces an OPA policy that blocks a request if the response contains an X-Gate-Anonymize-1 header. The PII detection plugin adds this header to signal the presence of PII.
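For readers less familiar with Rego, the decision made by this policy can be restated in a few lines of Python. This is only a sketch of the logic, not how Gate evaluates policies; the `allow` and `status_for` helpers are hypothetical names, and the 403 status is taken from the transcript in this article.

```python
PII_HEADER = "X-Gate-Anonymize-1"


def allow(response_headers: dict) -> bool:
    """Mirror of the policy: allow only when the PII marker header is absent.

    The lookup is an exact key match, like the object lookup in the Rego rule.
    """
    return PII_HEADER not in response_headers


def status_for(response_headers: dict, upstream_status: int = 200) -> int:
    """Return the upstream status when allowed, otherwise 403 Forbidden."""
    return upstream_status if allow(response_headers) else 403


if __name__ == "__main__":
    clean = {"Content-Type": "application/json"}
    flagged = {"Content-Type": "application/json",
               PII_HEADER: "$.body.email 0 64 EMAIL_ADDRESS"}
    print(status_for(clean), status_for(flagged))
```

Because the rule keys off a header rather than the body, the policy itself never needs to parse or inspect the PII.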

Let’s see it in action:

/usr/server/app $ curl --verbose 'http://gate:8080/api/get_user/?email=test@example.com' | jq
* processing: http://gate:8080/api/get_user/?email=test@example.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 172.27.0.5:8080...
* Connected to gate (172.27.0.5) port 8080
> GET /api/get_user/?email=test@example.com HTTP/1.1
> Host: gate:8080
> User-Agent: curl/8.2.1
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: private, max-age=600, stale-while-revalidate=300
< Content-Length: 0
< Content-Type: application/json
< Date: Sat, 02 Sep 2023 13:58:00 GMT
< Server: uvicorn
< Via: 1.0 gate
< X-Gate-Anonymize-1: $.body.email 0 64 EMAIL_ADDRESS
<
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
* Connection #0 to host gate left intact

Note that in this example pii_anonymizer is set to monitoring mode: type: keep for all PII types (DEFAULT). The plugin allows PII to pass through unchanged, without replacing it with an anonymized version of the data or changing the traffic in any way.

- id: pii_anonymizer
  type: anonymizer
  enabled: false
  intercept: request_response
  parameters:
    anonymizers: |
      DEFAULT:
        type: keep

Differential policy enforcement for authenticated users

Let’s now enforce a new OPA policy that blocks requests containing PII only if the user is not authenticated, while allowing PII in requests of authenticated users.

For simplicity, in this example we’ll use SlashID Access to handle authentication, but any Identity Provider (IdP) would be suitable.

gate:
  port: 8080
  log:
    format: text
    level: info

  plugins_http_cache:
    - pattern: '*'
      cache_control_override: private, max-age=600, stale-while-revalidate=300

  plugins:
    - id: authz_allow_if_authed_pii
      type: opa
      enabled: false
      intercept: response
      parameters:
        <<: *slashid_config
        policy_decision_path: /authz/allow
        policy: |
          package authz

          import future.keywords.if

          default allow := false

          key_found(obj, key) if { obj[key] }

          jwks_request := http.send({
              "cache": true,
              "method": "GET",
              "url": "https://api.slashid.com/.well-known/jwks.json"
          })
          valid_signature if io.jwt.verify_rs256(input.request.token, jwks_request.raw_body)

          allow if not key_found(input.response.http.headers, "X-Gate-Anonymize-1")
          allow if valid_signature

    - id: pii_anonymizer
      type: anonymizer
      enabled: false
      intercept: request_response
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep
  urls:
    - pattern: '*/get_user'
      target: http://backend:8000
      plugins:
        pii_anonymizer:
          enabled: true
        authz_allow_if_authed_pii:
          enabled: true

This rule is a bit more complicated, so let’s walk through it step by step.

  1. First, we retrieve the JSON Web Key Set (JWKS) from https://api.slashid.com/.well-known/jwks.json.

  2. Then, we check whether the incoming authorization token has a valid RS256 signature from SlashID, or whether X-Gate-Anonymize-1 is absent from the response.

  3. If either condition is true, the request is allowed.

Let’s see this in action:

curl --verbose -L 'http://gate:8080/api/get_user/?email=test@example.com' | jq
* processing: http://gate:8080/api/get_user/?email=test@example.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 172.27.0.5:8080...
* Connected to gate (172.27.0.5) port 8080
> GET /api/get_user/?email=test@example.com HTTP/1.1
> Host: gate:8080
> User-Agent: curl/8.2.1
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: private, max-age=600, stale-while-revalidate=300
< Content-Length: 0
< Content-Type: application/json
< Date: Sat, 02 Sep 2023 16:04:24 GMT
< Server: uvicorn
< Via: 1.0 gate
< X-Gate-Anonymize-1: $.body.email 0 64 EMAIL_ADDRESS
<
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host gate left intact

The request above is blocked because there is PII in the response and no valid JWT has been provided.
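The two `allow` rules of the policy combine as a logical OR. A small Python sketch of that decision follows; `verify_signature` is a hypothetical callable standing in for the `io.jwt.verify_rs256` check against the JWKS, so the example deliberately performs no real cryptography.

```python
from typing import Callable, Mapping, Optional

PII_HEADER = "X-Gate-Anonymize-1"


def allow(
    response_headers: Mapping[str, str],
    token: Optional[str],
    verify_signature: Callable[[str], bool],
) -> bool:
    """Allow when the response carries no PII marker, or the caller is authenticated."""
    # Rule 1: no PII was detected in the response.
    if PII_HEADER not in response_headers:
        return True
    # Rule 2: PII is present, so a token with a valid signature is required.
    return token is not None and verify_signature(token)


if __name__ == "__main__":
    # Stub verifier for the demo only: accepts a single known token.
    def fake_verify(candidate: str) -> bool:
        return candidate == "valid-token"

    flagged = {PII_HEADER: "$.body.email 0 64 EMAIL_ADDRESS"}
    print(allow(flagged, None, fake_verify))           # unauthenticated, PII present
    print(allow(flagged, "valid-token", fake_verify))  # authenticated, PII present
```

Anything that matches neither rule falls through to the default deny.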

Let’s send a request with a valid token:

curl -H "Authorization: Bearer <TOKEN>" 'http://gate:8080/api/get_user/?email=test@example.com' | jq
{
  "email": "test@example.com",
  "id": 1,
  "name": "Test User"
}

Note that in this case we configured the PII plugin to flag the presence of PII without replacing or obfuscating it, which is why we see the original clear-text response.

Depending on the IdP you are using, it is also possible to create more complex policies that not only check the validity of the identity token, but also examine specific properties of the token. (Look out for our next Gate blogpost for a deeper dive into this topic!)
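As a rough illustration of what examining token properties involves, the Python sketch below decodes the payload of a JWT and checks two claims. Everything specific here is an assumption: the `groups` claim name is hypothetical and varies by IdP, and the helper names are made up. Decoding a payload this way does not verify the signature, so such checks are only meaningful after the signature has been validated.

```python
import base64
import json
import time


def _b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def unverified_claims(token: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = token.split(".")
    return json.loads(_b64url_decode(payload_b64))


def claims_ok(claims: dict, required_group: str, now: float = None) -> bool:
    """Check expiry and membership in a (hypothetical) `groups` claim."""
    now = time.time() if now is None else now
    not_expired = claims.get("exp", 0) > now
    in_group = required_group in claims.get("groups", [])
    return not_expired and in_group
```

In an OPA policy, the equivalent checks would be expressed as additional conditions alongside the signature check.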

Blocking requests to unknown URLs

More often than not, companies don’t really know which APIs are exposed to begin with. Gate can help in this scenario too.

Gate plugin instances can be applied to all routes, or you can select specific routes. In the example config below we enable the PII and OPA plugin instances on all routes and selectively disable them on specific routes:


gate:
  port: 8080
  log:
    format: text
    level: info

  plugins_http_cache:
    - pattern: "*"
      cache_control_override: private, max-age=600, stale-while-revalidate=300

  plugins:
    - id: authz_allow_if_authed_pii
      type: opa
      enabled: true
      intercept: response
      parameters:
        <<: *slashid_config
        policy_decision_path: /authz/allow
        policy: |
          package authz

          import future.keywords.if

          default allow := false

          key_found(obj, key) if { obj[key] }

          jwks_request := http.send({
              "cache": true,
              "method": "GET",
              "url": "https://api.slashid.com/.well-known/jwks.json"
          })
          valid_signature if io.jwt.verify_rs256(input.request.token, jwks_request.raw_body)

          allow if not key_found(input.response.http.headers, "X-Gate-Anonymize-1")
          allow if valid_signature
    - id: pii_anonymizer
      type: anonymizer
      enabled: true
      intercept: request_response
      parameters:
        anonymizers: |
          DEFAULT:
            type: keep

  urls:

    - pattern: "*/api/echo"
      target: http://backend:8000
      plugins:
        authz_allow_if_authed_pii:
          enabled: false
        pii_anonymizer:
          enabled: false

    - pattern: "*"
      target: http://backend:8000

Note how the plugins are defined as enabled by default and how in the URLs we explicitly disable the plugins on selected paths (e.g. "*/api/echo").

/usr/server/app $ curl --verbose -X POST 'http://gate:8080/api/echo' -d "email=abc@abc.com" | jq
Note: Unnecessary use of -X or --request, POST is already inferred.
* processing: http://gate:8080/api/echo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 172.27.0.5:8080...
* Connected to gate (172.27.0.5) port 8080
> POST /api/echo HTTP/1.1
> Host: gate:8080
> User-Agent: curl/8.2.1
> Accept: */*
> Content-Length: 17
> Content-Type: application/x-www-form-urlencoded
>
} [17 bytes data]
< HTTP/1.1 200 OK
< Cache-Control: private, max-age=600, stale-while-revalidate=300
< Content-Length: 360
< Content-Type: application/json
< Date: Sun, 03 Sep 2023 09:30:38 GMT
< Server: uvicorn
< Via: 1.0 gate
<
{ [360 bytes data]
100   377  100   360  100    17  32933   1555 --:--:-- --:--:-- --:--:-- 37700
* Connection #0 to host gate left intact
{
  "method": "POST",
  "headers": {
    "host": "backend:8000",
    "user-agent": "curl/8.2.1",
    "content-length": "17",
    "accept": "*/*",
    "content-type": "application/x-www-form-urlencoded",
    "x-b3-sampled": "1",
    "x-b3-spanid": "39b9a26c103c6b5d",
    "x-b3-traceid": "ce0b56fc209ec47fbe0496606595c06b",
    "accept-encoding": "gzip"
  },
  "url": "http://backend:8000/api/echo",
  "body": {
    "email": "abc@abc.com"
  }
}
/usr/server/app $ curl --verbose 'http://gate:8080/api/get_user/?email=test@example.com' | jq
* processing: http://gate:8080/api/get_user/?email=test@example.com
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 172.27.0.5:8080...
* Connected to gate (172.27.0.5) port 8080
> GET /api/get_user/?email=test@example.com HTTP/1.1
> Host: gate:8080
> User-Agent: curl/8.2.1
> Accept: */*
>
< HTTP/1.1 403 Forbidden
< Cache-Control: private, max-age=600, stale-while-revalidate=300
< Content-Length: 0
< Content-Type: application/json
< Date: Sun, 03 Sep 2023 09:31:37 GMT
< Server: uvicorn
< Via: 1.0 gate
< X-Gate-Anonymize-1: $.body.email 0 16 EMAIL_ADDRESS
<
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0
* Connection #0 to host gate left intact
/usr/server/app $
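The interplay between globally enabled plugins and per-route overrides can be modelled in a few lines of Python. This sketch rests on two assumptions that are not confirmed here: that routes are evaluated in order with the first match winning, and that patterns behave like shell-style globs. The `effective_plugins` helper is hypothetical.

```python
from fnmatch import fnmatchcase

# Plugins enabled by default, as in the `plugins` section of the config.
GLOBAL_PLUGINS = {"authz_allow_if_authed_pii": True, "pii_anonymizer": True}

# Routes in the order they appear under `urls`, with their overrides.
URLS = [
    {
        "pattern": "*/api/echo",
        "plugins": {"authz_allow_if_authed_pii": False, "pii_anonymizer": False},
    },
    {"pattern": "*", "plugins": {}},
]


def effective_plugins(url: str) -> dict:
    """Return the plugin on/off map for the first route whose pattern matches."""
    for route in URLS:
        if fnmatchcase(url, route["pattern"]):
            # Per-route settings override the global defaults.
            return {**GLOBAL_PLUGINS, **route["plugins"]}
    return {}  # no route matched


if __name__ == "__main__":
    print(effective_plugins("gate:8080/api/echo"))
    print(effective_plugins("gate:8080/api/get_user/"))
```

The catch-all `*` route is what protects endpoints nobody remembered to list: anything not explicitly exempted inherits the default plugins.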

Running in monitoring mode

Just like the PII detection plugin, the OPA plugin also supports monitoring mode. To enable it, add monitoring_mode: true to its parameters as shown below:

    - id: authz_allow_if_authed_pii
      type: opa
      enabled: true
      intercept: response
      parameters:
        <<: *slashid_config
        monitoring_mode: true
        policy_decision_path: /authz/allow
        policy: |
          package authz

          import future.keywords.if

          default allow := false

          key_found(obj, key) if { obj[key] }

          jwks_request := http.send({
              "cache": true,
              "method": "GET",
              "url": "https://api.slashid.com/.well-known/jwks.json"
          })
          valid_signature if io.jwt.verify_rs256(input.request.token, jwks_request.raw_body)

          allow if not key_found(input.response.http.headers, "X-Gate-Anonymize-1")
          allow if valid_signature

Let’s send a request with an invalid token:

curl -H "Authorization: Bearer abc" 'http://gate:8080/api/get_user/?email=test@example.com' | jq
{
  "email": "test@example.com",
  "id": 1,
  "name": "Test User"
}

The request passes but Gate logs the policy violation:

gate-demo-gate-1        | time=2023-09-04T13:37:06Z level=info msg=OPA decision: false decision_id=d9b20a8d-da43-4786-ae15-1ec91199786d decision_provenance={0.55.0 19fc439d01c8d667b128606390ad2cb9ded04b16-dirty 2023-09-02T15:18:29Z   map[gate:{}]} plugin=opa req_path=/api/get_user/ request_host=gate:8080 request_url=/api/get_user/?email=test%40example.com
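In monitoring mode, log lines like the one above become the signal to act on. The Python sketch below extracts the decision and request path from such a line and tallies denied paths. The regular expressions are written against the single sample line shown in this article; the log format of your Gate version may differ, so treat them as an assumption.

```python
import re
from collections import Counter

DECISION_RE = re.compile(r"msg=OPA decision: (?P<decision>true|false)")
PATH_RE = re.compile(r"req_path=(?P<path>\S+)")

# The sample log line from this article, split only for readability.
SAMPLE = (
    "gate-demo-gate-1        | time=2023-09-04T13:37:06Z level=info "
    "msg=OPA decision: false decision_id=d9b20a8d-da43-4786-ae15-1ec91199786d "
    "decision_provenance={0.55.0 19fc439d01c8d667b128606390ad2cb9ded04b16-dirty "
    "2023-09-02T15:18:29Z   map[gate:{}]} plugin=opa req_path=/api/get_user/ "
    "request_host=gate:8080 request_url=/api/get_user/?email=test%40example.com"
)


def parse_decision(line: str):
    """Return (req_path, allowed) for an OPA decision line, or None for other lines."""
    decision = DECISION_RE.search(line)
    path = PATH_RE.search(line)
    if decision is None or path is None:
        return None
    return path.group("path"), decision.group("decision") == "true"


def count_violations(lines) -> Counter:
    """Count denied decisions per request path."""
    denied = Counter()
    for line in lines:
        parsed = parse_decision(line)
        if parsed is not None and not parsed[1]:
            denied[parsed[0]] += 1
    return denied


if __name__ == "__main__":
    print(parse_decision(SAMPLE))
    print(count_violations([SAMPLE, "unrelated log line"]))
```

A tally like this helps gauge how many legitimate requests a policy would block before switching it from monitoring to enforcement.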

Performance

Performance is key when intercepting and modifying network traffic, so our plugins were built with high performance in mind. For instance, we embed an optimized version of a Rego interpreter rather than standing up a separate OPA server.

Let’s look at a simple benchmark to see the impact of the two plugins on the network traffic.

Here’s a simple benchmarking script:

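#!/bin/sh
# Usage: ./benchmark.sh <iterations> <url>
iterations=$1
url=$2

echo "Running $iterations iterations for curl $url"
totaltime=0.0

for run in $(seq 1 "$iterations")
do
  # -w prints only the total request time in seconds; the body is discarded.
  time=$(curl "$url" \
    -s -o /dev/null -w "%{time_total}")
  totaltime=$(echo "$totaltime + $time" | bc)
done

# Convert the summed seconds into an average in milliseconds.
avgtimeMs=$(echo "scale=4; 1000*$totaltime/$iterations" | bc)

echo "Averaged $avgtimeMs ms in $iterations iterations"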

In our demo, a request without any interception results in the following:

/usr/server/app $ ./benchmark.sh 10000 'http://gate:8080/api/get_user/?email=test@example.com'
Running 10000 iterations for curl http://gate:8080/api/get_user/?email=test@example.com
Averaged 1.1820 ms in 10000 iterations
/usr/server/app $

When we enable PII detection and rewriting (hashing of the email address) coupled with our caching plugin:

/usr/server/app $ ./benchmark.sh 10000 'http://gate:8080/api/get_user/?email=test@example.com'
Running 10000 iterations for curl http://gate:8080/api/get_user/?email=test@example.com
Averaged 1.5955 ms in 10000 iterations
/usr/server/app $

Next, we test PII detection in monitoring mode:

/usr/server/app $ ./benchmark.sh 10000 'http://gate:8080/api/get_user/?email=test@example.com'
Running 10000 iterations for curl http://gate:8080/api/get_user/?email=test@example.com
Averaged 1.5176 ms in 10000 iterations
/usr/server/app $

Last, let’s run PII detection in monitoring mode coupled with OPA like we did in the example in the previous section:

/usr/server/app $ ./benchmark.sh 10000 'http://gate:8080/api/get_user/?email=test@example.com'
Running 10000 iterations for curl http://gate:8080/api/get_user/?email=test@example.com
Averaged 1.8532 ms in 10000 iterations
/usr/server/app $

Thanks to a combination of our caching plugin and Gate’s own architecture, the average overhead in our toy application is 0.6712 ms (1.8532 ms versus the 1.1820 ms baseline) when both OPA and PII detection are turned on.
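The overhead figures follow directly from the four averages reported above. The short Python snippet below reproduces the arithmetic; the configuration labels are ours, and the numbers are the ones from the benchmark runs in this section.

```python
# Average latency per request in milliseconds, as reported above.
BASELINE_MS = 1.1820  # no interception
RUNS_MS = {
    "PII hashing + caching": 1.5955,
    "PII monitoring": 1.5176,
    "PII monitoring + OPA": 1.8532,
}


def overhead_ms(avg_ms: float, baseline_ms: float = BASELINE_MS) -> float:
    """Added latency over the baseline, rounded to the precision of the inputs."""
    return round(avg_ms - baseline_ms, 4)


if __name__ == "__main__":
    for name, avg in RUNS_MS.items():
        extra = overhead_ms(avg)
        relative = 100 * extra / BASELINE_MS
        print(f"{name}: +{extra:.4f} ms ({relative:.1f}% over baseline)")
```

Keep in mind that these are averages from a toy application on a local Docker network, so the absolute numbers will not carry over to a production deployment.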

Conclusion

In this blogpost we’ve shown how you can combine the Gate PII and OPA plugins to easily detect and prevent PII leakage.

We’d love to hear any feedback you may have! Try out Gate with a free account. If you’d like to use the PII detection plugin, please contact us at contact@slashid.dev!

Vincenzo Iozzo
