@yronglin
Contributor

This PR reapplies #107168.

@yronglin yronglin requested a review from Endilll as a code owner December 20, 2025 02:14
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:modules C++20 modules and Clang Header Modules labels Dec 20, 2025
@llvmbot
Member

llvmbot commented Dec 20, 2025

@llvm/pr-subscribers-clang-modules

@llvm/pr-subscribers-clang

Author: None (yronglin)

Changes

This PR reapplies #107168.


Patch is 141.89 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/173130.diff

44 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/docs/StandardCPlusPlusModules.rst (-27)
  • (modified) clang/include/clang/Basic/DiagnosticLexKinds.td (+17-2)
  • (modified) clang/include/clang/Basic/DiagnosticParseKinds.td (+6-4)
  • (modified) clang/include/clang/Basic/IdentifierTable.h (+28-10)
  • (modified) clang/include/clang/Basic/TokenKinds.def (+8)
  • (modified) clang/include/clang/Basic/TokenKinds.h (+4)
  • (modified) clang/include/clang/Frontend/CompilerInstance.h (+1-1)
  • (modified) clang/include/clang/Lex/CodeCompletionHandler.h (+8)
  • (modified) clang/include/clang/Lex/DependencyDirectivesScanner.h (+16)
  • (modified) clang/include/clang/Lex/ModuleLoader.h (+1)
  • (modified) clang/include/clang/Lex/Preprocessor.h (+135-61)
  • (modified) clang/include/clang/Lex/Token.h (+5)
  • (modified) clang/include/clang/Lex/TokenLexer.h (+12)
  • (modified) clang/include/clang/Parse/Parser.h (+5-4)
  • (modified) clang/lib/Basic/IdentifierTable.cpp (+14-2)
  • (modified) clang/lib/Basic/TokenKinds.cpp (+12)
  • (modified) clang/lib/DependencyScanning/ModuleDepCollector.cpp (+2-1)
  • (modified) clang/lib/Frontend/CompilerInstance.cpp (+8-4)
  • (modified) clang/lib/Frontend/InitPreprocessor.cpp (+7)
  • (modified) clang/lib/Frontend/PrintPreprocessedOutput.cpp (+32-1)
  • (modified) clang/lib/Lex/DependencyDirectivesScanner.cpp (+149-15)
  • (modified) clang/lib/Lex/Lexer.cpp (+50-10)
  • (modified) clang/lib/Lex/PPDirectives.cpp (+439-31)
  • (modified) clang/lib/Lex/Preprocessor.cpp (+275-209)
  • (modified) clang/lib/Lex/TokenConcatenation.cpp (+5-3)
  • (modified) clang/lib/Lex/TokenLexer.cpp (+27-1)
  • (modified) clang/lib/Parse/Parser.cpp (+47-77)
  • (modified) clang/lib/Sema/SemaModule.cpp (+7-24)
  • (modified) clang/test/CXX/basic/basic.link/p3.cpp (+7-8)
  • (modified) clang/test/CXX/basic/basic.scope/basic.scope.namespace/p2.cpp (+1-1)
  • (added) clang/test/CXX/drs/cwg2947.cpp (+81)
  • (modified) clang/test/CXX/lex/lex.pptoken/p3-2a.cpp (+19-14)
  • (modified) clang/test/CXX/module/basic/basic.link/module-declaration.cpp (+2-2)
  • (added) clang/test/CXX/module/cpp.pre/p1.cpp (+207)
  • (modified) clang/test/CXX/module/dcl.dcl/dcl.module/dcl.module.import/p1.cppm (+4-4)
  • (added) clang/test/Lexer/cxx20-module-directive.cpp (+11)
  • (modified) clang/test/Modules/pr121066.cpp (+3-1)
  • (modified) clang/test/Modules/preprocess-named-modules.cppm (+1-1)
  • (modified) clang/unittests/ASTMatchers/ASTMatchersNodeTest.cpp (+2-1)
  • (modified) clang/unittests/Lex/DependencyDirectivesScannerTest.cpp (+43-5)
  • (modified) clang/unittests/Lex/ModuleDeclStateTest.cpp (+1-1)
  • (modified) clang/www/cxx_dr_status.html (+1-1)
  • (modified) clang/www/cxx_status.html (+2-9)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 994ac444d4aa1..abc6dab2f0614 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -207,6 +207,7 @@ C++20 Feature Support
 - Clang now normalizes constraints before checking whether they are satisfied, as mandated by the standard.
   As a result, Clang no longer incorrectly diagnoses substitution failures in template arguments only
   used in concept-ids, and produces better diagnostics for satisfaction failure. (#GH61811) (#GH135190)
+- Clang now supports `P1857R3 <https://wg21.link/p1857r3>`_ Modules Dependency Discovery. (#GH54047)
 
 C++17 Feature Support
 ^^^^^^^^^^^^^^^^^^^^^
diff --git a/clang/docs/StandardCPlusPlusModules.rst b/clang/docs/StandardCPlusPlusModules.rst
index 71988d0fced98..f6ab17ede46fa 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -1384,33 +1384,6 @@ declarations which use it. Thus, the preferred name will not be displayed in
 the debugger as expected. This is tracked by
 `#56490 <https://github.com/llvm/llvm-project/issues/56490>`_.
 
-Don't emit macros about module declaration
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-This is covered by `P1857R3 <https://wg21.link/P1857R3>`_. It is mentioned here
-because we want users to be aware that we don't yet implement it.
-
-A direct approach to write code that can be compiled by both modules and
-non-module builds may look like:
-
-.. code-block:: c++
-
-  MODULE
-  IMPORT header_name
-  EXPORT_MODULE MODULE_NAME;
-  IMPORT header_name
-  EXPORT ...
-
-The intent of this is that this file can be compiled like a module unit or a
-non-module unit depending on the definition of some macros. However, this usage
-is forbidden by P1857R3 which is not yet implemented in Clang. This means that
-is possible to write invalid modules which will no longer be accepted once
-P1857R3 is implemented. This is tracked by
-`#54047 <https://github.com/llvm/llvm-project/issues/54047>`_.
-
-Until then, it is recommended not to mix macros with module declarations.
-
-
 Inconsistent filename suffix requirement for importable module units
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
diff --git a/clang/include/clang/Basic/DiagnosticLexKinds.td b/clang/include/clang/Basic/DiagnosticLexKinds.td
index a72d3f37b1b72..77feea9f869e9 100644
--- a/clang/include/clang/Basic/DiagnosticLexKinds.td
+++ b/clang/include/clang/Basic/DiagnosticLexKinds.td
@@ -503,8 +503,8 @@ def warn_cxx98_compat_variadic_macro : Warning<
   InGroup<CXX98CompatPedantic>, DefaultIgnore;
 def ext_named_variadic_macro : Extension<
   "named variadic macros are a GNU extension">, InGroup<VariadicMacros>;
-def err_embedded_directive : Error<
-  "embedding a #%0 directive within macro arguments is not supported">;
+def err_embedded_directive : Error<"embedding a %select{#|C++ }0%1 directive "
+                                   "within macro arguments is not supported">;
 def ext_embedded_directive : Extension<
   "embedding a directive within macro arguments has undefined behavior">,
   InGroup<DiagGroup<"embedded-directive">>;
@@ -998,6 +998,21 @@ def warn_module_conflict : Warning<
   InGroup<ModuleConflict>;
 
 // C++20 modules
+def err_pp_module_name_is_macro : Error<
+  "%select{module|partition}0 name component %1 cannot be a object-like macro">;
+def err_pp_module_expected_ident : Error<
+  "expected %select{identifier after '.' in |}0module name">;
+def err_pp_unexpected_tok_after_module_name : Error<
+  "unexpected preprocessing token '%0' after module name, "
+  "only ';' and '[' (start of attribute specifier sequence) are allowed">;
+def warn_pp_extra_tokens_at_module_directive_eol
+    : Warning<"extra tokens after semicolon in '%0' directive">,
+      InGroup<ExtraTokens>;
+def err_pp_module_decl_in_header
+    : Error<"module declaration must not come from an #include directive">;
+def err_pp_cond_span_module_decl
+    : Error<"module directive lines are not allowed on lines controlled "
+    "by preprocessor conditionals">;
 def err_header_import_semi_in_macro : Error<
   "semicolon terminating header import declaration cannot be produced "
   "by a macro">;
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 662fe16d965b6..83d4ce3ca278c 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1792,10 +1792,8 @@ def ext_bit_int : Extension<
 } // end of Parse Issue category.
 
 let CategoryName = "Modules Issue" in {
-def err_unexpected_module_decl : Error<
-  "module declaration can only appear at the top level">;
-def err_module_expected_ident : Error<
-  "expected a module name after '%select{module|import}0'">;
+def err_unexpected_module_or_import_decl : Error<
+  "%select{module|import}0 declaration can only appear at the top level">;
 def err_attribute_not_module_attr : Error<
   "%0 attribute cannot be applied to a module">;
 def err_keyword_not_module_attr : Error<
@@ -1806,6 +1804,10 @@ def err_keyword_not_import_attr : Error<
   "%0 cannot be applied to a module import">;
 def err_module_expected_semi : Error<
   "expected ';' after module name">;
+def err_expected_semi_after_module_or_import
+  : Error<"%0 directive must end with a ';'">;
+def note_module_declared_here : Note<
+  "%select{module|import}0 directive defined here">;
 def err_global_module_introducer_not_at_start : Error<
   "'module;' introducing a global module fragment can appear only "
   "at the start of the translation unit">;
diff --git a/clang/include/clang/Basic/IdentifierTable.h b/clang/include/clang/Basic/IdentifierTable.h
index 043c184323876..1131727ed23ee 100644
--- a/clang/include/clang/Basic/IdentifierTable.h
+++ b/clang/include/clang/Basic/IdentifierTable.h
@@ -231,6 +231,10 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
   LLVM_PREFERRED_TYPE(bool)
   unsigned IsModulesImport : 1;
 
+  // True if this is the 'module' contextual keyword.
+  LLVM_PREFERRED_TYPE(bool)
+  unsigned IsModulesDecl : 1;
+
   // True if this is a mangled OpenMP variant name.
   LLVM_PREFERRED_TYPE(bool)
   unsigned IsMangledOpenMPVariantName : 1;
@@ -267,8 +271,9 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
         IsCPPOperatorKeyword(false), NeedsHandleIdentifier(false),
         IsFromAST(false), ChangedAfterLoad(false), FEChangedAfterLoad(false),
         RevertedTokenID(false), OutOfDate(false), IsModulesImport(false),
-        IsMangledOpenMPVariantName(false), IsDeprecatedMacro(false),
-        IsRestrictExpansion(false), IsFinal(false), IsKeywordInCpp(false) {}
+        IsModulesDecl(false), IsMangledOpenMPVariantName(false),
+        IsDeprecatedMacro(false), IsRestrictExpansion(false), IsFinal(false),
+        IsKeywordInCpp(false) {}
 
 public:
   IdentifierInfo(const IdentifierInfo &) = delete;
@@ -569,12 +574,24 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
   }
 
   /// Determine whether this is the contextual keyword \c import.
-  bool isModulesImport() const { return IsModulesImport; }
+  bool isImportKeyword() const { return IsModulesImport; }
 
   /// Set whether this identifier is the contextual keyword \c import.
-  void setModulesImport(bool I) {
-    IsModulesImport = I;
-    if (I)
+  void setKeywordImport(bool Val) {
+    IsModulesImport = Val;
+    if (Val)
+      NeedsHandleIdentifier = true;
+    else
+      RecomputeNeedsHandleIdentifier();
+  }
+
+  /// Determine whether this is the contextual keyword \c module.
+  bool isModuleKeyword() const { return IsModulesDecl; }
+
+  /// Set whether this identifier is the contextual keyword \c module.
+  void setModuleKeyword(bool Val) {
+    IsModulesDecl = Val;
+    if (Val)
       NeedsHandleIdentifier = true;
     else
       RecomputeNeedsHandleIdentifier();
@@ -629,7 +646,7 @@ class alignas(IdentifierInfoAlignment) IdentifierInfo {
   void RecomputeNeedsHandleIdentifier() {
     NeedsHandleIdentifier = isPoisoned() || hasMacroDefinition() ||
                             isExtensionToken() || isFutureCompatKeyword() ||
-                            isOutOfDate() || isModulesImport();
+                            isOutOfDate() || isImportKeyword();
   }
 };
 
@@ -797,10 +814,11 @@ class IdentifierTable {
     // contents.
     II->Entry = &Entry;
 
-    // If this is the 'import' contextual keyword, mark it as such.
+    // If this is the 'import' or 'module' contextual keyword, mark it as such.
     if (Name == "import")
-      II->setModulesImport(true);
-
+      II->setKeywordImport(true);
+    else if (Name == "module")
+      II->setModuleKeyword(true);
     return *II;
   }
 
diff --git a/clang/include/clang/Basic/TokenKinds.def b/clang/include/clang/Basic/TokenKinds.def
index 3d955095b07a8..a3d286fdb81a7 100644
--- a/clang/include/clang/Basic/TokenKinds.def
+++ b/clang/include/clang/Basic/TokenKinds.def
@@ -133,6 +133,11 @@ PPKEYWORD(pragma)
 // C23 & C++26 #embed
 PPKEYWORD(embed)
 
+// C++20 Module Directive
+PPKEYWORD(module)
+PPKEYWORD(__preprocessed_module)
+PPKEYWORD(__preprocessed_import)
+
 // GNU Extensions.
 PPKEYWORD(import)
 PPKEYWORD(include_next)
@@ -1030,6 +1035,9 @@ ANNOTATION(module_include)
 ANNOTATION(module_begin)
 ANNOTATION(module_end)
 
+// Annotations for C++, Clang and Objective-C named modules.
+ANNOTATION(module_name)
+
 // Annotation for a header_name token that has been looked up and transformed
 // into the name of a header unit.
 ANNOTATION(header_unit)
diff --git a/clang/include/clang/Basic/TokenKinds.h b/clang/include/clang/Basic/TokenKinds.h
index a801113c57715..c0316257d9d97 100644
--- a/clang/include/clang/Basic/TokenKinds.h
+++ b/clang/include/clang/Basic/TokenKinds.h
@@ -76,6 +76,10 @@ const char *getPunctuatorSpelling(TokenKind Kind) LLVM_READNONE;
 /// tokens like 'int' and 'dynamic_cast'. Returns NULL for other token kinds.
 const char *getKeywordSpelling(TokenKind Kind) LLVM_READNONE;
 
+/// Determines the spelling of simple Objective-C keyword tokens like '@import'.
+/// Returns NULL for other token kinds.
+const char *getObjCKeywordSpelling(ObjCKeywordKind Kind) LLVM_READNONE;
+
 /// Returns the spelling of preprocessor keywords, such as "else".
 const char *getPPKeywordSpelling(PPKeywordKind Kind) LLVM_READNONE;
 
diff --git a/clang/include/clang/Frontend/CompilerInstance.h b/clang/include/clang/Frontend/CompilerInstance.h
index a8e8461b9b5a9..42ef3ea7b355a 100644
--- a/clang/include/clang/Frontend/CompilerInstance.h
+++ b/clang/include/clang/Frontend/CompilerInstance.h
@@ -893,7 +893,7 @@ class CompilerInstance : public ModuleLoader {
   /// load it.
   ModuleLoadResult findOrCompileModuleAndReadAST(StringRef ModuleName,
                                                  SourceLocation ImportLoc,
-                                                 SourceLocation ModuleNameLoc,
+                                                 SourceRange ModuleNameRange,
                                                  bool IsInclusionDirective);
 
   /// Creates a \c CompilerInstance for compiling a module.
diff --git a/clang/include/clang/Lex/CodeCompletionHandler.h b/clang/include/clang/Lex/CodeCompletionHandler.h
index bd3e05a36bb33..2ef29743415ae 100644
--- a/clang/include/clang/Lex/CodeCompletionHandler.h
+++ b/clang/include/clang/Lex/CodeCompletionHandler.h
@@ -13,12 +13,15 @@
 #ifndef LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
 #define LLVM_CLANG_LEX_CODECOMPLETIONHANDLER_H
 
+#include "clang/Basic/IdentifierTable.h"
+#include "clang/Basic/SourceLocation.h"
 #include "llvm/ADT/StringRef.h"
 
 namespace clang {
 
 class IdentifierInfo;
 class MacroInfo;
+using ModuleIdPath = ArrayRef<IdentifierLoc>;
 
 /// Callback handler that receives notifications when performing code
 /// completion within the preprocessor.
@@ -70,6 +73,11 @@ class CodeCompletionHandler {
   /// file where we expect natural language, e.g., a comment, string, or
   /// \#error directive.
   virtual void CodeCompleteNaturalLanguage() { }
+
+  /// Callback invoked when performing code completion inside the module name
+  /// part of an import directive.
+  virtual void CodeCompleteModuleImport(SourceLocation ImportLoc,
+                                        ModuleIdPath Path) {}
 };
 
 }
diff --git a/clang/include/clang/Lex/DependencyDirectivesScanner.h b/clang/include/clang/Lex/DependencyDirectivesScanner.h
index f9fec3998ca53..b21da166a96e5 100644
--- a/clang/include/clang/Lex/DependencyDirectivesScanner.h
+++ b/clang/include/clang/Lex/DependencyDirectivesScanner.h
@@ -135,6 +135,22 @@ void printDependencyDirectivesAsSource(
     ArrayRef<dependency_directives_scan::Directive> Directives,
     llvm::raw_ostream &OS);
 
+/// Scan an input source buffer for C++20 named module usage.
+///
+/// \param Source The input source buffer.
+///
+/// \returns true if any C++20 named modules related directive was found.
+bool scanInputForCXX20ModulesUsage(StringRef Source);
+
+/// Scan an input source buffer, and check whether the input source is a
+/// preprocessed output.
+///
+/// \param Source The input source buffer.
+///
+/// \returns true if any '__preprocessed_module' or '__preprocessed_import'
+/// directive was found.
+bool isPreprocessedModuleFile(StringRef Source);
+
 /// Functor that returns the dependency directives for a given file.
 class DependencyDirectivesGetter {
 public:
diff --git a/clang/include/clang/Lex/ModuleLoader.h b/clang/include/clang/Lex/ModuleLoader.h
index a58407200c41c..042a5ab1f4a57 100644
--- a/clang/include/clang/Lex/ModuleLoader.h
+++ b/clang/include/clang/Lex/ModuleLoader.h
@@ -159,6 +159,7 @@ class ModuleLoader {
   /// \returns Returns true if any modules with that symbol found.
   virtual bool lookupMissingImports(StringRef Name,
                                     SourceLocation TriggerLoc) = 0;
+  static std::string getFlatNameFromPath(ModuleIdPath Path);
 
   bool HadFatalFailure = false;
 };
diff --git a/clang/include/clang/Lex/Preprocessor.h b/clang/include/clang/Lex/Preprocessor.h
index b1c648e647f41..c8356b1dd45e4 100644
--- a/clang/include/clang/Lex/Preprocessor.h
+++ b/clang/include/clang/Lex/Preprocessor.h
@@ -48,6 +48,7 @@
 #include "llvm/Support/Allocator.h"
 #include "llvm/Support/Casting.h"
 #include "llvm/Support/Registry.h"
+#include "llvm/Support/TrailingObjects.h"
 #include <cassert>
 #include <cstddef>
 #include <cstdint>
@@ -136,6 +137,64 @@ struct CXXStandardLibraryVersionInfo {
   std::uint64_t Version;
 };
 
+/// Record the previous 'export' keyword info.
+///
+/// Since P1857R3, the standard introduced several rules to determine whether
+/// the 'module', 'export module', 'import', 'export import' is a valid
+/// directive introducer. This class is used to record the previous 'export'
+/// keyword token, and then handle 'export module' and 'export import'.
+class ExportContextualKeywordInfo {
+  Token ExportTok;
+  bool AtPhysicalStartOfLine = false;
+
+public:
+  ExportContextualKeywordInfo() = default;
+  ExportContextualKeywordInfo(const Token &Tok, bool AtPhysicalStartOfLine)
+      : ExportTok(Tok), AtPhysicalStartOfLine(AtPhysicalStartOfLine) {}
+
+  bool isValid() const { return ExportTok.is(tok::kw_export); }
+  bool isAtPhysicalStartOfLine() const { return AtPhysicalStartOfLine; }
+  Token getExportTok() const { return ExportTok; }
+  void reset() {
+    ExportTok.startToken();
+    AtPhysicalStartOfLine = false;
+  }
+};
+
+class ModuleNameLoc final
+    : llvm::TrailingObjects<ModuleNameLoc, IdentifierLoc> {
+  friend TrailingObjects;
+  unsigned NumIdentifierLocs;
+  unsigned numTrailingObjects(OverloadToken<IdentifierLoc>) const {
+    return getNumIdentifierLocs();
+  }
+
+  ModuleNameLoc(ModuleIdPath Path) : NumIdentifierLocs(Path.size()) {
+    (void)llvm::copy(Path, getTrailingObjectsNonStrict<IdentifierLoc>());
+  }
+
+public:
+  static ModuleNameLoc *Create(Preprocessor &PP, ModuleIdPath Path);
+  unsigned getNumIdentifierLocs() const { return NumIdentifierLocs; }
+  ModuleIdPath getModuleIdPath() const {
+    return {getTrailingObjectsNonStrict<IdentifierLoc>(),
+            getNumIdentifierLocs()};
+  }
+
+  SourceLocation getBeginLoc() const {
+    return getModuleIdPath().front().getLoc();
+  }
+  SourceLocation getEndLoc() const {
+    auto &Last = getModuleIdPath().back();
+    return Last.getLoc().getLocWithOffset(
+        Last.getIdentifierInfo()->getLength());
+  }
+  SourceRange getRange() const { return {getBeginLoc(), getEndLoc()}; }
+  std::string str() const {
+    return ModuleLoader::getFlatNameFromPath(getModuleIdPath());
+  }
+};
+
 /// Engages in a tight little dance with the lexer to efficiently
 /// preprocess tokens.
 ///
@@ -339,8 +398,9 @@ class Preprocessor {
   /// lexed, if any.
   SourceLocation ModuleImportLoc;
 
-  /// The import path for named module that we're currently processing.
-  SmallVector<IdentifierLoc, 2> NamedModuleImportPath;
+  /// The source location of the \c module contextual keyword we just
+  /// lexed, if any.
+  SourceLocation ModuleDeclLoc;
 
   llvm::DenseMap<FileID, SmallVector<const char *>> CheckPoints;
   unsigned CheckPointCounter = 0;
@@ -351,6 +411,12 @@ class Preprocessor {
   /// Whether the last token we lexed was an '@'.
   bool LastTokenWasAt = false;
 
+  /// Whether we're importing a standard C++20 named Modules.
+  bool ImportingCXXNamedModules = false;
+
+  /// Whether the last token we lexed was an 'export' keyword.
+  ExportContextualKeywordInfo LastTokenWasExportKeyword;
+
   /// First pp-token source location in current translation unit.
   SourceLocation FirstPPTokenLoc;
 
@@ -562,9 +628,9 @@ class Preprocessor {
         reset();
     }
 
-    void handleIdentifier(IdentifierInfo *Identifier) {
-      if (isModuleCandidate() && Identifier)
-        Name += Identifier->getName().str();
+    void handleModuleName(ModuleNameLoc *NameLoc) {
+      if (isModuleCandidate() && NameLoc)
+        Name += NameLoc->str();
       else if (!isNamedModule())
         reset();
     }
@@ -576,13 +642,6 @@ class Preprocessor {
         reset();
     }
 
-    void handlePeriod() {
-      if (isModuleCandidate())
-        Name += ".";
-      else if (!isNamedModule())
-        reset();
-    }
-
     void handleSemi() {
       if (!Name.empty() && isModuleCandidate()) {
         if (State == InterfaceCandidate)
@@ -639,10 +698,6 @@ class Preprocessor {
 
   ModuleDeclSeq ModuleDeclState;
 
-  /// Whether the module import expects an identifier next. Otherwise,
-  /// it expects a '.' or ';'.
-  bool ModuleImportExpectsIdentifier = false;
-
   /// The identifier and source location of the currently-active
   /// \#pragma clang arc_cf_code_audited begin.
   IdentifierLoc PragmaARCCFCodeAuditedInfo;
@@ -1125,6 +1180,9 @@ class Preprocessor {
   /// Whether tokens are being skipped until the through header is seen.
   bool SkippingUntilPCHThroughHeader = false;
 
+  /// Whether the main file is preprocessed module file.
+  bool MainFileIsPreprocessedModuleFile = false;
+
   /// \{
   /// Cache of macro expanders to reduce malloc traffic.
   enum { TokenLexerCacheSize = 8 };
@@ -1778,6 +1836,36 @@ class Preprocessor {
   std::optional<LexEmbedParametersResult> LexEmbedParameters(Token &Current,
                                                              bool ForHasEmbed);
 
+  /// Whether the main file is preprocessed module file.
+  bool isPreprocessedModuleFile() const {
+    return MainFileIsPreprocessedModuleFile;
+  }
+
+  /// Mark the main file as a preprocessed module file, then the 'module' and
+  /// 'import' directive recognition will be suppressed. Only
+  /// '__preprocessed_module' and '__preprocessed_import' are allowed.
+  void markMainFileAsPreprocessedModuleFile() {
+    MainFileIsPreprocessedModuleFile = true;
+  }
+
+  bool LexModuleNameContinue(Token &Tok, SourceLocation UseLoc,
+                             SmallVectorImpl<Token> &Suffix,
+                             SmallVectorImpl<IdentifierLoc> &Path,
+                             bool AllowMacroExpansion = true,
+                             bool IsPartition = false);
+  void EnterModuleSuffixTokenStream(ArrayRef<Token> Toks);
+  void HandleCXXImportDirective(Token Import);
+  void HandleCXXModuleDirective(Token Module);
+
+  /// Callback invoked when the lexer sees one of export, import or module token
+  /// at the start of a line.
+  ///
+  /// This consumes the...
[truncated]
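
For readers following the `ModuleNameLoc::str()` change above: the patch declares `ModuleLoader::getFlatNameFromPath(ModuleIdPath Path)`, but its body falls outside the truncated diff. The removed `handleIdentifier`/`handlePeriod` pair built the name by appending identifiers separated by `.`, so a plausible reading is that the flat name is the dot-joined path. The sketch below illustrates only that assumption. `ModulePath` and `flatNameFromPath` are hypothetical stand-ins that work on plain strings, not Clang's `IdentifierLoc`-based API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for clang's ModuleIdPath: only the identifier
// spellings, without source locations.
using ModulePath = std::vector<std::string>;

// Assumed behavior (not taken from the patch): join the path components
// with '.', so {"a", "b", "c"} becomes "a.b.c".
std::string flatNameFromPath(const ModulePath &Path) {
  std::string Name;
  for (std::size_t I = 0; I < Path.size(); ++I) {
    if (I != 0)
      Name += '.'; // separator between components, never leading or trailing
    Name += Path[I];
  }
  return Name;
}
```

Under this assumption, a module name lexed as the components `std`, `compat` would be recorded as the single string `std.compat`, which matches what the old per-token handlers accumulated.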

@yronglin
Contributor Author

@ilovepi Could you help verify whether this branch can resolve this crash issue? Many thanks!

@aemerson
Contributor

Please see my request (not a demand) to hold off on this until the new year: #107168 (comment)

@cor3ntin
Contributor

@aemerson are you fine with landing this if @yronglin watches over the bots?
We really should aim to have this in 22 (it's been worked on for a very long time and is somewhat important for module tooling) - and so the sooner we find potential issues the better.

@yronglin Did you run msan stage 2 builds locally?

@aemerson
Contributor

> @aemerson are you fine with landing this if @yronglin watches over the bots? We really should aim to have this in 22 (it's been worked on for a very long time and is somewhat important for module tooling) - and so the sooner we find potential issues the better.

You really don't need my approval to reland; it wasn't my intention to act as a blocker. My point was just that the people who watch over more extensive testing of trunk may not be around to report issues until the new year. That said, please go ahead if you think it's better to get an early signal 🙂

@yronglin
Contributor Author

> @yronglin Did you run msan stage 2 builds locally?

Yes, I can reproduce it locally. I'm putting together a debug + MSan build to investigate the root cause.
