[Java] Clarification Needed on DataFlow::FeatureHasSourceCallContext Behavior #21097
-
|
Hi, To illustrate, I've reproduced the issue in Java: I'm tracking a public void source(String taint) {
return randomMethod(taint)
}
public Object randomMethod(String taint) {
return taint
}
public void otherMethod(String param) {
Object obj = randomMethod(taint)
// continue the tracking / flow
} I initially thought that I feel I'm not the only one struggling with this issue: #17241 I'm surprised by this kind of result. I would like a way to ensure that my path goes from source to sink and that it can be called in my program. Thank you |
Beta Was this translation helpful? Give feedback.
Replies: 2 comments 11 replies
-
|
Hi 👋 Can you share the query that you used to track flow? I'm surprised that you see flow going from As for flow features, they are used to restrict which kind of flow is allowed at sources/sinks:
Flow features are rarely needed in practice, and if the underlying problem is the erroneous flow mentioned above, then it sounds more like a bug in the data flow library. |
Beta Was this translation helpful? Give feedback.
-
|
@hvitved I have extracted the relevant code from my codebase and created a query to show you. So here is my Java code, I have basically copy/paste the weird function which is package com.test;
import java.util.Locale;
public class StringUtils {
private static final int TO_UPPER_CACHE_LENGTH = 2048;
private static final int TO_UPPER_CACHE_MAX_ENTRY_LENGTH = 64;
private static final String[][] TO_UPPER_CACHE = new String[TO_UPPER_CACHE_LENGTH][];
private StringUtils() {
}
public static String toUpperEnglish(String var0) {
if (var0 == null) return null;
if (var0.length() > TO_UPPER_CACHE_MAX_ENTRY_LENGTH) {
return var0.toUpperCase(Locale.ENGLISH);
} else {
int var1 = var0.hashCode() & (TO_UPPER_CACHE_LENGTH - 1);
String[] var2 = TO_UPPER_CACHE[var1];
if (var2 != null && var2[0].equals(var0)) {
return var2[1];
} else {
String var3 = var0.toUpperCase(Locale.ENGLISH);
var2 = new String[]{var0, var3};
TO_UPPER_CACHE[var1] = var2;
return var3;
}
}
}
}
class Sourceclass {
public Object sourceMethod(Object source) {
return (String) Sinkclass.sinkMethod(StringUtils.toUpperEnglish((String) source));
}
}
class Other {
private static String randomMethod(Object var1) {
return (String) Sinkclass.sinkMethod(StringUtils.toUpperEnglish((String) var1));
}
}
class Sinkclass {
public static Object sinkMethod(Object source) {
return source;
}
}Note the Here is my CodeQL query: /**
* @id test
* @description test
* @name test
* @kind path-problem
* @problem.severity warning
* @tags security
*/
import java
import semmle.code.java.dataflow.FlowSources
import semmle.code.java.security.CommandLineQuery
class SourceMethod extends Method {
SourceMethod() { this.getName() = "sourceMethod" }
}
module MyConfig implements DataFlow::ConfigSig {
predicate isSource(DataFlow::Node source) {
source.asParameter().getCallable() instanceof SourceMethod
}
predicate isSink(DataFlow::Node sink) {
exists(Call c |
c.getCallee().hasQualifiedName("com.test", "Sinkclass", "sinkMethod") and
sink.asExpr() = c.getAnArgument()
)
}
}
module MyFlow = TaintTracking::Global<MyConfig>;
import MyFlow::PathGraph
from MyFlow::PathNode source, MyFlow::PathNode sink
where MyFlow::flowPath(source, sink)
select sink.getNode(), source, sink, "sink: $@", source.getNode(), sink.toStringWithContext()For this I get 2 paths, the one I'm looking for:
but also this one:
|
Beta Was this translation helpful? Give feedback.


That looks completely correct to me. If flow reaches a static variable then surely that same static variable can be read from other unrelated calls to
toUpperEnglish- storing into and reading from a static variable effectively throws away the call context. So even though this may be considered a false positive, it's working as intended. Data flow cannot distinguish between individual array indices here, so as soon as one entry in the cache is tainted, then they're all considered tainted.